谷粒商城 - Cluster - 90 - ShardingSphere: A Powerful Tool for Database and Table Sharding
I. ShardingSphere Overview
ShardingSphere is an ecosystem of open-source distributed database middleware solutions. It consists of three mutually independent products: Sharding-JDBC, Sharding-Proxy, and Sharding-Sidecar (planned). All of them provide standardized data sharding, distributed transactions, and database governance, and they fit a wide variety of scenarios such as homogeneous Java applications, heterogeneous languages, and cloud-native deployments.
II. Sharding-Proxy
Positioned as a transparent database proxy, it provides a server that speaks the database's binary protocol, which makes it usable from heterogeneous languages.
III. Data Sharding
1. Core Concepts
Logical table
The collective name for a group of horizontally split tables that share the same logic and schema. For example, if order data is split into 10 tables by the last digit of the primary key, t_order_0 through t_order_9, their logical table name is t_order.
Actual table
The physical tables that actually exist in the sharded databases, i.e. t_order_0 through t_order_9 in the example above.
Data node
The smallest unit of data sharding, composed of a data source name and a table name, e.g. ds_0.t_order_0.
Binding tables
Primary and child tables that share the same sharding rule. For example, t_order and t_order_item are both sharded by order_id, so the two tables form a binding relationship. Joins between binding tables do not produce Cartesian-product routing, which greatly improves join performance. For example, given the SQL:
SELECT i.* FROM t_order o JOIN t_order_item i ON o.order_id=i.order_id WHERE o.order_id in (10, 11);
Without a binding relationship configured, assume the sharding key order_id routes the value 10 to shard 0 and the value 11 to shard 1; the routed SQL then consists of 4 statements, forming a Cartesian product:
SELECT i.* FROM t_order_0 o JOIN t_order_item_0 i ON o.order_id=i.order_id WHERE o.order_id in (10, 11);
SELECT i.* FROM t_order_0 o JOIN t_order_item_1 i ON o.order_id=i.order_id WHERE o.order_id in (10, 11);
SELECT i.* FROM t_order_1 o JOIN t_order_item_0 i ON o.order_id=i.order_id WHERE o.order_id in (10, 11);
SELECT i.* FROM t_order_1 o JOIN t_order_item_1 i ON o.order_id=i.order_id WHERE o.order_id in (10, 11);
With the binding relationship configured, only 2 routed statements are produced:
SELECT i.* FROM t_order_0 o JOIN t_order_item_0 i ON o.order_id=i.order_id WHERE o.order_id in (10, 11);
SELECT i.* FROM t_order_1 o JOIN t_order_item_1 i ON o.order_id=i.order_id WHERE o.order_id in (10, 11);
Since t_order is the leftmost table in the FROM clause, ShardingSphere treats it as the primary table of the binding group. All routing is computed using only the primary table's strategy, so the sharding calculation for t_order_item uses t_order's conditions. For this to work, the sharding keys of binding tables must be exactly the same.
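As a minimal sketch (using the same field names as the sharding rule examples later in this article), the binding relationship itself is declared with a single bindingTables entry, while both tables keep the same sharding column:
shardingRule:
  tables:
    t_order:
      actualDataNodes: ds_${0..1}.t_order_${0..1}
      tableStrategy:
        inline:
          shardingColumn: order_id            # both tables shard on order_id
          algorithmExpression: t_order_${order_id % 2}
    t_order_item:
      actualDataNodes: ds_${0..1}.t_order_item_${0..1}
      tableStrategy:
        inline:
          shardingColumn: order_id            # identical sharding key as t_order
          algorithmExpression: t_order_item_${order_id % 2}
  bindingTables:
    - t_order,t_order_item                    # declares the two tables as one binding group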
IV. Configuration
Sharding rule
The top-level entry point of the sharding configuration. It contains the data source configuration, table configuration, binding table configuration, read-write splitting configuration, and so on.
Data source configuration
The list of actual data sources.
Table configuration
Configuration of logical table names, data nodes, and sharding rules.
Data node configuration
Configures the mapping between logical tables and actual tables. It comes in two forms: uniform distribution and custom distribution.
- Uniform distribution
Tables are evenly distributed across every data source, for example:
db0
├── t_order0
└── t_order1
db1
├── t_order0
└── t_order1
The data nodes are then configured as:
db0.t_order0, db0.t_order1, db1.t_order0, db1.t_order1
- Custom distribution
Tables follow a distribution with its own specific rule, for example:
db0
├── t_order0
└── t_order1
db1
├── t_order2
├── t_order3
└── t_order4
The data nodes are then configured as:
db0.t_order0, db0.t_order1, db1.t_order2, db1.t_order3, db1.t_order4
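In practice these node lists are usually written with inline expressions instead of being enumerated by hand. A minimal sketch, using the ${..} expression syntax that also appears in the actualDataNodes entries later in this article:
# uniform distribution: expands to db0.t_order0, db0.t_order1, db1.t_order0, db1.t_order1
actualDataNodes: db${0..1}.t_order${0..1}
# custom distribution: several expressions can be combined with commas
actualDataNodes: db0.t_order${0..1},db1.t_order${2..4}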
Sharding strategy configuration
There are two dimensions of sharding strategy: the database (data source) sharding strategy and the table sharding strategy.
- Database sharding strategy
Corresponds to DatabaseShardingStrategy and configures the target data source that data is routed to.
- Table sharding strategy
Corresponds to TableShardingStrategy and configures the target table that data is routed to. The target table lives inside the data source chosen for that row, so the table sharding strategy depends on the result of the database sharding strategy.
The APIs of the two strategies are completely identical.
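A minimal sketch of the two dimensions working together, taken from the inline-strategy examples later in this article: user_id selects the database, then order_id selects the table inside it.
t_order:
  actualDataNodes: ds_${0..1}.t_order_${0..1}
  databaseStrategy:                 # database sharding strategy: picks the target data source
    inline:
      shardingColumn: user_id
      algorithmExpression: ds_${user_id % 2}
  tableStrategy:                    # table sharding strategy: picks the target table within that data source
    inline:
      shardingColumn: order_id
      algorithmExpression: t_order_${order_id % 2}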
Auto-increment key generation strategy
By generating auto-increment keys on the client side instead of relying on the database's native auto-increment columns, distributed primary keys are guaranteed not to collide.
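A minimal sketch, using the SNOWFLAKE key generator that the examples later in this article also use:
t_order:
  keyGenerator:
    type: SNOWFLAKE    # 64-bit distributed IDs generated outside the database
    column: order_id   # the column to fill when an INSERT statement omits it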
Configuration manual
Data sharding + read-write splitting
schemaName: sharding_master_slave_db
dataSources:
ds0:
url: jdbc:postgresql://localhost:5432/ds0
username: root
password:
connectionTimeoutMilliseconds: 30000
idleTimeoutMilliseconds: 60000
maxLifetimeMilliseconds: 1800000
maxPoolSize: 65
ds0_slave0:
url: jdbc:postgresql://localhost:5432/ds0_slave0
username: root
password:
connectionTimeoutMilliseconds: 30000
idleTimeoutMilliseconds: 60000
maxLifetimeMilliseconds: 1800000
maxPoolSize: 65
ds0_slave1:
url: jdbc:postgresql://localhost:5432/ds0_slave1
username: root
password:
connectionTimeoutMilliseconds: 30000
idleTimeoutMilliseconds: 60000
maxLifetimeMilliseconds: 1800000
maxPoolSize: 65
ds1:
url: jdbc:postgresql://localhost:5432/ds1
username: root
password:
connectionTimeoutMilliseconds: 30000
idleTimeoutMilliseconds: 60000
maxLifetimeMilliseconds: 1800000
maxPoolSize: 65
ds1_slave0:
url: jdbc:postgresql://localhost:5432/ds1_slave0
username: root
password:
connectionTimeoutMilliseconds: 30000
idleTimeoutMilliseconds: 60000
maxLifetimeMilliseconds: 1800000
maxPoolSize: 65
ds1_slave1:
url: jdbc:postgresql://localhost:5432/ds1_slave1
username: root
password:
connectionTimeoutMilliseconds: 30000
idleTimeoutMilliseconds: 60000
maxLifetimeMilliseconds: 1800000
maxPoolSize: 65
shardingRule:
tables:
t_order:
actualDataNodes: ms_ds${0..1}.t_order${0..1}
databaseStrategy:
inline:
shardingColumn: user_id
algorithmExpression: ms_ds${user_id % 2}
tableStrategy:
inline:
shardingColumn: order_id
algorithmExpression: t_order${order_id % 2}
keyGenerator:
type: SNOWFLAKE
column: order_id
t_order_item:
actualDataNodes: ms_ds${0..1}.t_order_item${0..1}
databaseStrategy:
inline:
shardingColumn: user_id
algorithmExpression: ms_ds${user_id % 2}
tableStrategy:
inline:
shardingColumn: order_id
algorithmExpression: t_order_item${order_id % 2}
keyGenerator:
type: SNOWFLAKE
column: order_item_id
bindingTables:
- t_order,t_order_item
broadcastTables:
- t_config
defaultDataSourceName: ds0
defaultTableStrategy:
none:
masterSlaveRules:
ms_ds0:
masterDataSourceName: ds0
slaveDataSourceNames:
- ds0_slave0
- ds0_slave1
loadBalanceAlgorithmType: ROUND_ROBIN
ms_ds1:
masterDataSourceName: ds1
slaveDataSourceNames:
- ds1_slave0
- ds1_slave1
loadBalanceAlgorithmType: ROUND_ROBIN
V. Hands-On Practice
1. Download Sharding-Proxy
2. Add the dependency
If the backend connects to MySQL, download MySQL Connector/J (or use a domestic mirror such as Sohu's). After unpacking, copy mysql-connector-java-5.1.47.jar into the ${sharding-proxy}\lib directory.
That is, copy the downloaded jar into the lib directory of the Sharding-Proxy distribution downloaded above:
3. Rule configuration
Edit %SHARDING_PROXY_HOME%\conf\server.yaml (see the configuration manual for details). Sharding-Proxy uses conf/server.yaml to configure the registry center, authentication information, and common properties.
Authentication:
Used to verify logins to Sharding-Proxy. After configuring usernames, passwords, and the schemas each user may access, a client must supply the correct username and password to log in to the proxy.
authentication:
  users:
    root: # custom username
      password: root # custom password
    sharding: # custom username
      password: sharding # custom password
      authorizedSchemas: sharding_db, masterslave_db # schemas this user may access, comma-separated; if omitted, the user has root-level access to all schemas
Authentication configuration: conf/server.yaml
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
######################################################################################################
#
# If you want to configure orchestration, authorization and proxy properties, please refer to this file.
#
######################################################################################################
#
#orchestration:
# orchestration_ds:
# orchestrationType: registry_center,config_center,distributed_lock_manager
# instanceType: zookeeper
# serverLists: localhost:2181
# namespace: orchestration
# props:
# overwrite: false
# retryIntervalMilliseconds: 500
# timeToLiveSeconds: 60
# maxRetries: 3
# operationTimeoutMilliseconds: 500
#
authentication:
users:
root:
password: root
sharding:
password: sharding
authorizedSchemas: sharding_db
props:
# max.connections.size.per.query: 1
# acceptor.size: 16 # The default value is available processors count * 2.
executor.size: 16 # Infinite by default.
# proxy.frontend.flush.threshold: 128 # The default value is 128.
# # LOCAL: Proxy will run with LOCAL transaction.
# # XA: Proxy will run with XA transaction.
# # BASE: Proxy will run with B.A.S.E transaction.
# proxy.transaction.type: LOCAL
# proxy.opentracing.enabled: false
# proxy.hint.enabled: false
# query.with.cipher.column: true
sql.show: true
# allow.range.query.with.inline.sharding: false
Edit %SHARDING_PROXY_HOME%\conf\config-xxx.yaml (see the configuration manual for details).
Sharding (database + table) configuration: conf/config-sharding.yaml
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
######################################################################################################
#
# Here you can configure the rules for the proxy.
# This example is configuration of sharding rule.
#
# If you want to use sharding, please refer to this file;
# if you want to use master-slave, please refer to the config-master_slave.yaml.
#
######################################################################################################
#
#schemaName: sharding_db
#
#dataSources:
# ds_0:
# url: jdbc:postgresql://127.0.0.1:5432/demo_ds_0?serverTimezone=UTC&useSSL=false
# username: postgres
# password: postgres
# connectionTimeoutMilliseconds: 30000
# idleTimeoutMilliseconds: 60000
# maxLifetimeMilliseconds: 1800000
# maxPoolSize: 50
# ds_1:
# url: jdbc:postgresql://127.0.0.1:5432/demo_ds_1?serverTimezone=UTC&useSSL=false
# username: postgres
# password: postgres
# connectionTimeoutMilliseconds: 30000
# idleTimeoutMilliseconds: 60000
# maxLifetimeMilliseconds: 1800000
# maxPoolSize: 50
#
#shardingRule:
# tables:
# t_order:
# actualDataNodes: ds_${0..1}.t_order_${0..1}
# tableStrategy:
# inline:
# shardingColumn: order_id
# algorithmExpression: t_order_${order_id % 2}
# keyGenerator:
# type: SNOWFLAKE
# column: order_id
# t_order_item:
# actualDataNodes: ds_${0..1}.t_order_item_${0..1}
# tableStrategy:
# inline:
# shardingColumn: order_id
# algorithmExpression: t_order_item_${order_id % 2}
# keyGenerator:
# type: SNOWFLAKE
# column: order_item_id
# bindingTables:
# - t_order,t_order_item
# defaultDatabaseStrategy:
# inline:
# shardingColumn: user_id
# algorithmExpression: ds_${user_id % 2}
# defaultTableStrategy:
# none:
######################################################################################################
#
# If you want to connect to MySQL, you should manually copy MySQL driver to lib directory.
#
######################################################################################################
schemaName: sharding_db
dataSources:
ds_0:
url: jdbc:mysql://192.168.10.10:3307/demo_ds_0?serverTimezone=UTC&useSSL=false
username: root
password: root
connectionTimeoutMilliseconds: 30000
idleTimeoutMilliseconds: 60000
maxLifetimeMilliseconds: 1800000
maxPoolSize: 50
ds_1:
url: jdbc:mysql://192.168.10.10:3307/demo_ds_1?serverTimezone=UTC&useSSL=false
username: root
password: root
connectionTimeoutMilliseconds: 30000
idleTimeoutMilliseconds: 60000
maxLifetimeMilliseconds: 1800000
maxPoolSize: 50
#
shardingRule:
tables:
t_order:
actualDataNodes: ds_${0..1}.t_order_${0..1}
tableStrategy:
inline:
shardingColumn: order_id
algorithmExpression: t_order_${order_id % 2}
keyGenerator:
type: SNOWFLAKE
column: order_id
t_order_item:
actualDataNodes: ds_${0..1}.t_order_item_${0..1}
tableStrategy:
inline:
shardingColumn: order_id
algorithmExpression: t_order_item_${order_id % 2}
keyGenerator:
type: SNOWFLAKE
column: order_item_id
bindingTables:
- t_order,t_order_item
defaultDatabaseStrategy:
inline:
shardingColumn: user_id
algorithmExpression: ds_${user_id % 2}
defaultTableStrategy:
none:
Read-write splitting configuration:
Read-write splitting for the first database: conf/config-master_slave.yaml
######################################################################################################
#
# If you want to connect to MySQL, you should manually copy MySQL driver to lib directory.
#
######################################################################################################
########################
#
# Read-write splitting for the first database
# Reads go to the slave; writes, updates, and deletes go to the master
#
########################
schemaName: sharding_db_1
#
dataSources:
master_0_ds:
url: jdbc:mysql://192.168.10.10:3307/demo_ds_0?serverTimezone=UTC&useSSL=false
username: root
password: root
connectionTimeoutMilliseconds: 30000
idleTimeoutMilliseconds: 60000
maxLifetimeMilliseconds: 1800000
maxPoolSize: 50
slave_ds_0:
url: jdbc:mysql://192.168.10.10:3317/demo_ds_0?serverTimezone=UTC&useSSL=false
username: root
password: root
connectionTimeoutMilliseconds: 30000
idleTimeoutMilliseconds: 60000
maxLifetimeMilliseconds: 1800000
maxPoolSize: 50
masterSlaveRule:
name: ms_ds
masterDataSourceName: master_0_ds
slaveDataSourceNames:
- slave_ds_0
# - slave_ds_1
loadBalanceAlgorithmType: ROUND_ROBIN
Read-write splitting for the second database: conf/config-master_slave_2.yaml
Once all of the configuration is complete, the service is started (see step 6 below):
######################################################################################################
#
# If you want to connect to MySQL, you should manually copy MySQL driver to lib directory.
#
######################################################################################################
########################
#
# Read-write splitting for the second database
# Reads go to the slave; writes, updates, and deletes go to the master
#
########################
schemaName: sharding_db_2
#
dataSources:
master_1_ds:
url: jdbc:mysql://192.168.10.10:3307/demo_ds_1?serverTimezone=UTC&useSSL=false
username: root
password: root
connectionTimeoutMilliseconds: 30000
idleTimeoutMilliseconds: 60000
maxLifetimeMilliseconds: 1800000
maxPoolSize: 50
slave_ds_1:
url: jdbc:mysql://192.168.10.10:3317/demo_ds_1?serverTimezone=UTC&useSSL=false
username: root
password: root
connectionTimeoutMilliseconds: 30000
idleTimeoutMilliseconds: 60000
maxLifetimeMilliseconds: 1800000
maxPoolSize: 50
masterSlaveRule:
name: ms_ds
masterDataSourceName: master_1_ds
slaveDataSourceNames:
- slave_ds_1
# - slave_ds_1
loadBalanceAlgorithmType: ROUND_ROBIN
4. Update the master-slave replication configuration
1) First stop the master and the slave
[root@localhost master]# docker stop mysql-master mysql-slaver-01
mysql-master
mysql-slaver-01
2) Edit the replicated databases on the master: vi /mydata/mysql/master/conf/my.cnf
[client]
default-character-set=utf8
[mysql]
default-character-set=utf8
[mysqld]
init_connect='SET collation_connection = utf8_unicode_ci'
init_connect='SET NAMES utf8'
character-set-server=utf8
collation-server=utf8_unicode_ci
skip-character-set-client-handshake
skip-name-resolve
# master-slave replication
server_id=1
log-bin=mysql-bin
read-only=0
binlog-do-db=gulimall_ums
binlog-do-db=gulimall_pms
binlog-do-db=gulimall_oms
binlog-do-db=gulimall_wms
binlog-do-db=gulimall_admin
# newly added
binlog-do-db=demo_ds_0
binlog-do-db=demo_ds_1
replicate-ignore-db=mysql
replicate-ignore-db=sys
replicate-ignore-db=information_schema
replicate-ignore-db=performance_schema
3) Edit the replicated databases on the slave: vi /mydata/mysql/slaver/conf/my.cnf
[client]
default-character-set=utf8
[mysql]
default-character-set=utf8
[mysqld]
init_connect='SET collation_connection = utf8_unicode_ci'
init_connect='SET NAMES utf8'
character-set-server=utf8
collation-server=utf8_unicode_ci
skip-character-set-client-handshake
skip-name-resolve
# master-slave replication
server_id=2
log-bin=mysql-bin
read-only=1
binlog-do-db=gulimall_ums
binlog-do-db=gulimall_pms
binlog-do-db=gulimall_oms
binlog-do-db=gulimall_wms
binlog-do-db=gulimall_admin
# newly added
binlog-do-db=demo_ds_0
binlog-do-db=demo_ds_1
replicate-ignore-db=mysql
replicate-ignore-db=sys
replicate-ignore-db=information_schema
replicate-ignore-db=performance_schema
4) Restart the master and the slave
[root@localhost conf]# docker start mysql-master mysql-slaver-01
mysql-master
mysql-slaver-01
5. Create the databases
Manually create the demo_ds_0 and demo_ds_1 test databases on the master; you can see that the slave replicates these two newly created databases as well.
6. Start Sharding-Proxy
Start Sharding-Proxy and specify port 3388:
cd /Users/kaiyiwang/Web/sharding-proxy/bin
➜ bin ./start.sh 3388
Starting the Sharding-Proxy ...
The port is 3388
The classpath is /Users/kaiyiwang/Web/sharding-proxy/conf:.:..:/Users/kaiyiwang/Web/sharding-proxy/lib/*:/Users/kaiyiwang/Web/sharding-proxy/lib/*:/Users/kaiyiwang/Web/sharding-proxy/ext-lib/*
Please check the STDOUT file: /Users/kaiyiwang/Web/sharding-proxy/logs/stdout.log
Log output:
➜ cat /Users/kaiyiwang/Web/sharding-proxy/logs/stdout.log
...
[INFO ] 13:02:02.701 [nioEventLoopGroup-2-1] i.n.handler.logging.LoggingHandler - [id: 0x17cbeb2f] REGISTERED
[INFO ] 13:02:02.703 [nioEventLoopGroup-2-1] i.n.handler.logging.LoggingHandler - [id: 0x17cbeb2f] BIND: 0.0.0.0/0.0.0.0:3388
[INFO ] 13:02:02.707 [nioEventLoopGroup-2-1] i.n.handler.logging.LoggingHandler - [id: 0x17cbeb2f, L:/0.0.0.0:3388] ACTIVE
You can see that sharding-proxy has started successfully.
Then connect from a local client; the connection succeeds:
Three schemas are visible here; they are exactly the schemaName values we wrote in the configuration files above. From now on we only operate on the sharding_db schema.
Create the test tables:
CREATE TABLE IF NOT EXISTS t_order (order_id BIGINT NOT NULL AUTO_INCREMENT, user_id INT NOT NULL, status VARCHAR(50), PRIMARY KEY (order_id))
CREATE TABLE IF NOT EXISTS t_order_item (order_item_id BIGINT NOT NULL AUTO_INCREMENT, order_id BIGINT NOT NULL, user_id INT NOT NULL, PRIMARY KEY (order_item_id))
INSERT INTO t_order (user_id, status) VALUES (10, 'INIT')
INSERT INTO t_order_item (order_id, user_id) VALUES (%d, 10) # %d: use values 1-10; run it several times
INSERT INTO t_order (user_id, status) VALUES (11, 'INIT')
INSERT INTO t_order_item (order_id, user_id) VALUES (%d, 11) # %d: use values 1-10; run it several times
The logical table:
The logical sharded tables:
The actual sharded tables in each physical database:
As you can see, orders are sharded into databases by user ID and then into tables by order ID. ShardingSphere is very powerful: sharding across databases and tables requires no code changes, and the corresponding logical tables make it easy to query everything as a whole.
Summary
ShardingSphere is a very powerful distributed database middleware solution.
Its simple, easy-to-understand inline expressions are used to configure data nodes and sharding algorithms.
It also ships many heavyweight features of its own, such as forced (hint) routing.
The official documentation is complete and the example projects are fairly thorough, so database and table sharding can be set up in a short time.
Related articles:
Core concepts of ShardingSphere data sharding
sharding-sphere: a powerful tool for database and table sharding