Airflow 从入门到精通-02-迁移后台 DB 到 MySQL

一、本地安装MySQL

本地使用Docker安装MySQL,MySQL版本为5.7

docker run --name mysql -p 3307:3306 -e MYSQL_ROOT_PASSWORD=Quant888 -d mysql:5.7 --character-set-server=utf8mb4 --collation-server=utf8mb4_unicode_ci
create database airflow CHARACTER SET utf8;
use airflow;

二、修改配置文件airflow.cfg

executor = LocalExecutor
# sql_alchemy_conn = mysql://root:Quant888@98.14.132.145:3307/airflow
# 需要加charset=utf8,否则后边初始化时还会报错
sql_alchemy_conn = mysql://root:Quant888@98.14.132.145:3307/airflow?charset=utf8

三、初始化数据库

airflow db init
若之前使用sqllite初始化过可使用 airflow resetdb

重启 scheduler 即可(默认并发调度16), 启动webserver:

airflow scheduler
// 启动 webserver
airflow webserver -p 8080
// 重启
airflow webserver -p 8080 -D

四、可能遇到的问题

初始化 mysql 数据库时 ModuleNotFoundError: No module named 'MySQLdb'

file

解决:

yum install mysql-devel gcc gcc-devel python-devel
# pip3 install mysql-python
# pip3 install mysqlclient
# pip install mysqlclient
pip install mysql-connector-python==8.0.22

file

然后执行 airflow db init

[root@quant airflow]# airflow db init

在初始化数据库的时候,报了这样一个错误:
file

 File "/root/anaconda3/lib/python3.7/site-packages/airflow/migrations/versions/0e2a74e0fc9f_add_time_zone_awareness.py", line 44, in upgrade
    raise Exception("Global variable explicit_defaults_for_timestamp needs to be on (1) for mysql")
Exception: Global variable explicit_defaults_for_timestamp needs to be on (1) for mysql

解决方法:
进入mysql airflow 数据库,设置 global explicit_defaults_for_timestamp

mysql>  show global variables like '%timestamp%';
+---------------------------------+-------+
| Variable_name                   | Value |
+---------------------------------+-------+
| explicit_defaults_for_timestamp | OFF   |
| log_timestamps                  | UTC   |
+---------------------------------+-------+
2 rows in set (0.00 sec)

mysql> set global explicit_defaults_for_timestamp =1;
Query OK, 0 rows affected (0.00 sec)

mysql>

查看数据库表:

file

由于上边在初始化数据库时报错了,一部分表没有生成,所以,需要再次重新执行数据库初始化:

[root@quant airflow]# airflow db init

最后查看 airflow 初始化的数据库:
file

不过,在初始化最后的时候还是报了错误:

 s = self.string_literal(o.encode(self.encoding))
  File "/root/anaconda3/lib/python3.7/encodings/cp1252.py", line 12, in encode
    return codecs.charmap_encode(input,errors,encoding_table)
UnicodeEncodeError: 'charmap' codec can't encode characters in position 512-518: character maps to <undefined>
[root@quant airflow]# 

使用mysql作为airflow的数据库,在airflow.cfg中设置字符编码为utf-8,在初始化airflow数据库时还是会碰到如下错误:

UnicodeEncodeError: 'charmap' codec can't encode characters in position 0-3: character maps to <undefined>

这种还是在编码设置问题上,处理需要在airflow.cfg中设置编码 sql_engine_encoding = utf-8 还需要在 sql_alchemy_conn 中加上编码设置:

sql_alchemy_conn = mysql://{db_name}:{db_pwd}@{host}:{port}/airflow?charset=utf8

第二个是在命令行创建web用户的时候报的错

Table 'airflow.ab_permission_view_role' doesn't exist

只需要在配置中的 webserver 项下设置rbac: rbac = True ,再对数据库做一次初始化就可以

原文链接:https://blog.csdn.net/xufwind/article/details/113877019

file

可以看到,这次是初始化成功了。

五、创建用户

airflow users create \
    --username admin \
    --firstname Corwien \
    --lastname Wong \
    --role Admin \
    --email 407544577@qq.com

创建的用户密码为:quant

可以在数据库user表中看到已经创建成功:
file

六、重启其他服务

查看已经运行的进程:

[root@quant airflow]# ps aux | grep 'airflow'
root     21255  0.4  1.5 280212 61408 ?        S    08:08   1:05 airflow scheduler -- DagFileProcessorManager
root     28387  0.8  1.2 255256 51548 ?        S    Sep04  12:44 gunicorn: master [airflow-webserver]
root     28449 32.2  1.6 280468 67240 ?        R    Sep04 481:04 /root/anaconda3/bin/python /root/anaconda3/bin/airflow scheduler
root     28452  0.0  1.3 276024 56012 ?        S    Sep04   0:10 airflow serve-logs
root     32102  108  1.9 487676 80664 ?        Rl   12:26   0:03 gunicorn: worker [airflow-webserver]
root     32103  144  1.9 487676 80600 ?        Rl   12:26   0:02 gunicorn: worker [airflow-webserver]
root     32117  0.0  1.3 266740 53428 ?        R    12:26   0:00 gunicorn: worker [airflow-webserver]
root     32118  0.0  1.2 262168 48560 ?        R    12:26   0:00 gunicorn: worker [airflow-webserver]
root     32120  0.0  0.0 112828  2340 pts/0    R+   12:26   0:00 grep --color=auto airflow

批量生进程:

ps -ef | grep airflow | grep -v grep | awk '{print "kill -9 "$2}'  | sh

删除相关pid进程文件:
file

[root@quant airflow]# rm -rf airflow-webserver-monitor.pid
[root@quant airflow]# rm -rf airflow-scheduler.*
[root@quant airflow]# rm -rf airflow-webserver.*

file

重启任务

# 启动web服务
airflow webserver -p 8080 -D

# 启动定时任务
airflow scheduler -D

可以看到已经启动成功了。

file

file

为者常成,行者常至