Symptom
Unable to find config file hive-site.xml
Unable to find config file hivemetastore-site.xml
Unable to find config file metastore-site.xml
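This post records how the problem arises and how to provide hive-site.xml to Hive and Hudi so that it is loaded correctly.
Analysis: how HiveMetaStore locates its configuration files
Location: org.apache.hadoop.hive.metastore.conf.MetastoreConf#findConfigFile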
    private static URL findConfigFile(ClassLoader classLoader, String name) {
      // First, look in the classpath
      URL result = classLoader.getResource(name);
      if (result == null) {
        // Nope, so look to see if our conf dir has been explicitly set
        result = seeIfConfAtThisLocation("METASTORE_CONF_DIR", name, false);
        if (result == null) {
          // Nope, so look to see if our home dir has been explicitly set
          result = seeIfConfAtThisLocation("METASTORE_HOME", name, true);
          if (result == null) {
            // Nope, so look to see if Hive's conf dir has been explicitly set
            result = seeIfConfAtThisLocation("HIVE_CONF_DIR", name, false);
            if (result == null) {
              // Nope, so look to see if Hive's home dir has been explicitly set
              result = seeIfConfAtThisLocation("HIVE_HOME", name, true);
              if (result == null) {
                // Nope, so look to see if we can find a conf file by finding our jar, going up one
                // directory, and looking for a conf directory.
                URI jarUri = null;
                try {
                  jarUri = MetastoreConf.class.getProtectionDomain().getCodeSource().getLocation().toURI();
                } catch (Throwable e) {
                  LOG.warn("Cannot get jar URI", e);
                }
                result = seeIfConfAtThisLocation(new File(jarUri).getParent(), name, true);
                // At this point if we haven't found it, screw it, we don't know where it is
                if (result == null) {
                  LOG.info("Unable to find config file " + name);
                }
              }
            }
          }
        }
      }
      LOG.info("Found configuration file " + result);
      return result;
    }
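The message therefore means that every lookup failed: the file is not on the classpath, and it is not in any of the locations named by METASTORE_CONF_DIR, METASTORE_HOME, HIVE_CONF_DIR, or HIVE_HOME.
Even the last fallback, a conf directory located relative to the jar that contains the MetastoreConf class, does not have it.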
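To check which of these locations would match on a given machine, the lookup order can be re-created in a small standalone program. The following is only a sketch and not Hive's code: the class name ConfigLookupSketch and the method locate are made up for illustration, the environment is passed in as a map so the logic can be exercised without changing real environment variables, and the final jar-relative fallback is left out. It assumes, based on the code comments above, that the two *_HOME variables are searched under their conf subdirectory.

```java
import java.io.File;
import java.net.URL;
import java.util.Map;

/** Hypothetical diagnostic helper that mirrors the lookup order shown above. */
public class ConfigLookupSketch {

    /** Returns a description of where {@code name} was found, or null if it was not found. */
    public static String locate(ClassLoader classLoader, Map<String, String> env, String name) {
        // Step 1: the classpath.
        URL onClasspath = classLoader.getResource(name);
        if (onClasspath != null) {
            return "classpath: " + onClasspath;
        }
        // Steps 2-5: environment variables, in the same order as findConfigFile.
        String[] vars = {"METASTORE_CONF_DIR", "METASTORE_HOME", "HIVE_CONF_DIR", "HIVE_HOME"};
        // Assumption: for the *_HOME variables the file is expected under <dir>/conf.
        boolean[] inConfSubdir = {false, true, false, true};
        for (int i = 0; i < vars.length; i++) {
            String dir = env.get(vars[i]);
            if (dir == null) {
                continue; // variable not set, try the next one
            }
            File base = inConfSubdir[i] ? new File(dir, "conf") : new File(dir);
            File candidate = new File(base, name);
            if (candidate.isFile()) {
                return vars[i] + ": " + candidate.getAbsolutePath();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String where = locate(
                Thread.currentThread().getContextClassLoader(), System.getenv(), "hive-site.xml");
        System.out.println(where == null
                ? "hive-site.xml not found"
                : "hive-site.xml found via " + where);
    }
}
```

Running main inside the same environment as the failing process (same classpath, same environment variables) shows which step, if any, would succeed.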
Finding the cause: why hive-site.xml is not found in any of these locations
Location: org.apache.hadoop.hive.metastore.conf.MetastoreConf#newMetastoreConf
    if(hiveSiteURL == null) {
      /*
       * this 'if' is pretty lame - QTestUtil.QTestUtil() uses hiveSiteURL to load a specific
       * hive-site.xml from data/conf/<subdir> so this makes it follow the same logic - otherwise
       * HiveConf and MetastoreConf may load different hive-site.xml ( For example,
       * HiveConf uses data/conf/spark/hive-site.xml and MetastoreConf data/conf/hive-site.xml)
       */
      hiveSiteURL = findConfigFile(classLoader, "hive-site.xml");
    }
    if (hiveSiteURL != null) {
      conf.addResource(hiveSiteURL);
    }
findConfigFile is called only when the static variable hiveSiteURL has not been set. This is the normal case.
The Flink side
Location: org.apache.flink.table.catalog.hive.HiveCatalog#createHiveConf
    // ignore all the static conf file URLs that HiveConf may have set
    HiveConf.setHiveSiteLocation(null);
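Conclusions:
- Flink clears this static variable, so execution enters findConfigFile.
- MetastoreConf does not appear to support a hive-site.xml stored on HDFS.
- Whenever Flink creates a new HiveCatalog, this lookup is triggered.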
CLASSPATH analysis
Flink's CLASSPATH already provides the file, so why does hive-site.xml still fail to load? The file is at:
lib/hive-site.xml
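Yet classLoader.getResource(name) still cannot find it. Presumably name would have to be "lib/hive-site.xml" for the lookup to succeed. getResource resolves the name relative to each classpath root (a directory, or the root of a jar), so a file inside lib is visible as "hive-site.xml" only if the lib directory itself is a classpath entry, not merely the jars inside it.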
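The behaviour of getResource described here can be reproduced without Flink or Hive. The sketch below builds a temporary directory containing lib/hive-site.xml and queries it through a URLClassLoader with a single root. The class name ClasspathRootDemo and its helper methods are made up for illustration, and the temporary layout is only a stand-in for a Flink installation, not a statement about how Flink actually assembles its classpath.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical demo: resource names are resolved relative to classpath roots. */
public class ClasspathRootDemo {

    /** Returns true if {@code name} resolves through a class loader whose only root is {@code root}. */
    public static boolean visible(Path root, String name) {
        // The parent is null so that only the given root (plus the JDK itself) is searched.
        try (URLClassLoader loader = new URLClassLoader(new URL[] {root.toUri().toURL()}, null)) {
            return loader.getResource(name) != null;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Creates <tmp>/lib/hive-site.xml and returns <tmp>. */
    public static Path createLayout() {
        try {
            Path base = Files.createTempDirectory("flink-home");
            Path lib = Files.createDirectory(base.resolve("lib"));
            Files.write(lib.resolve("hive-site.xml"), "<configuration/>".getBytes());
            return base;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path base = createLayout();
        // Root is the parent of lib: the bare file name does not resolve.
        System.out.println(visible(base, "hive-site.xml"));                // false
        // Same root, but the name carries the lib/ prefix.
        System.out.println(visible(base, "lib/hive-site.xml"));            // true
        // Root is the lib directory itself: the bare file name resolves.
        System.out.println(visible(base.resolve("lib"), "hive-site.xml")); // true
    }
}
```

Since MetastoreConf always asks for the bare name "hive-site.xml", only the third arrangement would help on the classpath route.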
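Conclusion:
HIVE_CONF_DIR needs to be specified.
Solution
Pass HIVE_CONF_DIR to the Flink program. How exactly? Kyuubi can serve as a reference: it uses Flink's containerized.master.env. prefix, which sets an environment variable for the JobManager process.
That is: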
-Dcontainerized.master.env.HIVE_CONF_DIR=/etc/hive/conf