Rust.cc


Some tests on match, if-else, and table-driven dispatch

When it comes to control flow, Rust has long favored match. It is feature-complete, can largely replace the if-else chains of traditional languages, and can also largely replace the table-driven style used in other languages.

That is the usual advice, but few people say how it actually performs. From the assembly alone you can more or less see that match does far fewer comparisons than if-else, but real performance still has to be measured.

The test is a cyclic counter implemented through control flow, measured four ways: plain match, fancy match with if guards, traditional if-else, and a table-driven version built on HashMap<usize, Box<dyn Fn(&mut usize)>>.

Many languages implement table-driven dispatch with switch, and Rust usually does it with match, but some languages do it with a hashmap. Honestly, Rust code rarely uses containers whose values are functions. The ownership issues involved are fairly complicated, and even though the values are just functions, they still have to be wrapped in a smart pointer to a dynamic type, unlike languages where you can simply drop a function into the value slot. This test only uses Fn closures. I originally wanted to write a less hygienic, parameterless FnMut that captures x directly, but could not get it to compile no matter what, so this version will have to do.

The implementations look like this, assuming 64 branches:

    fn match_test(n: usize, mut x: usize) -> usize {
        for _ in 0..n {
            match x {
                1 => x = 2,
                2 => x = 3,
                // ...
                63 => x = 64,
                _ => x = 1
            }
        }
        x
    }

    fn match_if_test(n: usize, mut x: usize) -> usize {
        for _ in 0..n {
            match x {
                y if y == 1 => x = 2,
                y if y == 2 => x = 3,
                // ...
                y if y == 63 => x = 64,
                _ => x = 1
            }
        }
        x
    }

    fn if_else_test(n: usize, mut x: usize) -> usize {
        for _ in 0..n {
            if x == 1 { x = 2 }
            else if x == 2 { x = 3 }
            // ...
            else if x == 63 { x = 64 }
            else { x = 1 }
        }
        x
    }

    fn table_driven_test(n: usize, mut x: usize) -> usize {
        use std::collections::HashMap;
        // I wanted a less hygienic, parameterless FnMut here,
        // but never managed to solve the ownership problem around x.
        type Callback = Box<dyn Fn(&mut usize)>;
        let m: HashMap<usize, Callback> = HashMap::from([
            (1, Box::new(|x: &mut usize| *x = 2) as Callback),
            (2, Box::new(|x: &mut usize| *x = 3) as Callback),
            // ...
            (63, Box::new(|x: &mut usize| *x = 64) as Callback),
        ]);
        let default = |x: &mut usize| *x = 1;
        for _ in 0..n {
            if m.get(&x).map(|func| func(&mut x)).is_none() {
                // When x is not in 1..=63, reset x to 1.
                default(&mut x);
            }
        }
        x
    }

I rewrote the repeated logic with macros and timed each branch count, using n = 10_000_000 iterations. The results (ten runs per method):

    branches: 4
    match: 4.3417ms, 4.2835ms, 3.8984ms, 3.7605ms, 3.4414ms, 4.1675ms, 3.7215ms, 3.6409ms, 3.7537ms, 4.1638ms
    match_if: 3.7365ms, 3.6575ms, 4.5344ms, 3.7472ms, 3.5969ms, 3.5903ms, 4.2189ms, 4.0093ms, 3.5712ms, 3.5206ms
    if_else: 4.4605ms, 4.3242ms, 3.5882ms, 4.1222ms, 4.9582ms, 4.4ms, 3.5924ms, 4.166ms, 3.8549ms, 3.6634ms
    table_driven: 104.0682ms, 129.4536ms, 101.5466ms, 101.8384ms, 100.7811ms, 165.6563ms, 172.9952ms, 124.9258ms, 120.3115ms, 110.1817ms

    branches: 8
    match: 3.2001ms, 3.0407ms, 3.1535ms, 3.1043ms, 2.9263ms, 3.0348ms, 2.9395ms, 2.9264ms, 3.3435ms, 2.9902ms
    match_if: 3.1042ms, 3.0786ms, 2.9634ms, 3.0259ms, 3.2047ms, 3.0612ms, 3.0388ms, 2.9657ms, 2.986ms, 3.0801ms
    if_else: 3.4446ms, 3.8674ms, 3.7341ms, 2.9381ms, 3.0707ms, 4.0342ms, 2.9509ms, 2.9892ms, 2.986ms, 3.1982ms
    table_driven: 106.7574ms, 105.5663ms, 102.2015ms, 104.8766ms, 101.0446ms, 100.0284ms, 107.1194ms, 112.0209ms, 119.2932ms, 122.7158ms

    branches: 16
    match: 4.27ms, 4.4616ms, 3.9281ms, 4.1828ms, 3.8632ms, 3.8148ms, 3.8967ms, 4.0776ms, 3.9938ms, 4.0849ms
    match_if: 4.0625ms, 3.9611ms, 3.9091ms, 4.0129ms, 3.9441ms, 3.932ms, 3.8967ms, 3.9504ms, 3.9825ms, 3.9186ms
    if_else: 4.3945ms, 4.0326ms, 3.895ms, 3.932ms, 4.0213ms, 4.002ms, 3.9321ms, 3.8942ms, 3.9528ms, 3.9736ms
    table_driven: 110.0617ms, 108.8107ms, 104.6565ms, 104.4869ms, 102.2463ms, 103.9144ms, 101.618ms, 104.2009ms, 104.4785ms, 105.3564ms

    branches: 32
    match: 3.9087ms, 3.9974ms, 3.7856ms, 4.4178ms, 4.0035ms, 3.8378ms, 3.8372ms, 4.314ms, 4.1929ms, 3.8094ms
    match_if: 3.9669ms, 4.1839ms, 3.9322ms, 3.9001ms, 3.9521ms, 4.0726ms, 3.9288ms, 3.9003ms, 3.8927ms, 4.034ms
    if_else: 4.2824ms, 3.988ms, 4.1618ms, 4.0035ms, 3.9509ms, 3.9636ms, 4.1215ms, 3.8909ms, 3.9719ms, 3.9449ms
    table_driven: 109.9416ms, 105.4346ms, 101.2063ms, 106.2021ms, 146.3973ms, 197.2266ms, 200.199ms, 136.0667ms, 102.3032ms, 103.1737ms

    branches: 64
    match: 4.1269ms, 4.7294ms, 3.9456ms, 3.9763ms, 4.1859ms, 4.3841ms, 3.9903ms, 3.9992ms, 4.0657ms, 4.2758ms
    match_if: 4.0881ms, 4.3955ms, 4.2702ms, 4.5231ms, 3.9956ms, 3.9856ms, 4.3953ms, 3.9909ms, 3.9937ms, 3.9962ms
    if_else: 4.5749ms, 4.1437ms, 4.0957ms, 4.0846ms, 4.1792ms, 4.0903ms, 4.0876ms, 11.1614ms, 13.2386ms, 10.295ms
    table_driven: 172.1502ms, 165.3664ms, 181.8855ms, 201.7172ms, 158.34ms, 119.9631ms, 125.6167ms, 197.366ms, 206.2566ms, 122.1135ms

    branches: 128
    match: 4.3008ms, 4.1509ms, 5.3559ms, 4.2214ms, 4.1455ms, 4.4872ms, 4.6999ms, 4.2647ms, 4.251ms, 5.2885ms
    match_if: 6.6349ms, 6.2517ms, 6.5265ms, 6.0881ms, 6.6553ms, 6.4367ms, 7.0819ms, 6.3207ms, 6.0893ms, 6.3231ms
    if_else: 6.552ms, 6.1662ms, 6.4211ms, 6.2439ms, 6.3045ms, 6.3368ms, 6.467ms, 6.4995ms, 6.2427ms, 6.2635ms
    table_driven: 220.7586ms, 212.7129ms, 211.9967ms, 215.7127ms, 216.353ms, 219.4874ms, 212.6762ms, 213.5031ms, 212.8011ms, 218.6281ms

    branches: 1024
    match: 3.593ms, 4.6327ms, 3.5263ms, 3.4625ms, 4.235ms, 3.9134ms, 3.5307ms, 3.5078ms, 3.516ms, 3.7592ms
    match_if: 18.281ms, 14.9784ms, 14.9011ms, 14.8372ms, 15.064ms, 14.9355ms, 15.0232ms, 15.4237ms, 15.177ms, 16.1911ms
    if_else: 16.596ms, 19.3573ms, 17.9486ms, 15.5792ms, 19.8704ms, 15.1197ms, 15.1438ms, 15.2105ms, 15.4105ms, 15.515ms
    table_driven: 225.3498ms, 231.7748ms, 240.5668ms, 219.5685ms, 220.2287ms, 213.2436ms, 228.3222ms, 233.3159ms, 245.781ms, 241.3446ms

As the numbers show, with few branches (up to 64), compiler optimization leaves essentially no performance difference between match, match-if, and if-else. The table-driven version hashes an integer key, so in theory its cost should not change much either as long as the map does not have to grow, but it is far slower than the other three, roughly 25 to 50 times in the runs above.

At 1024 branches the advantage of match is obvious: its timings barely move, while match-if and if-else both degrade to about 3 to 5 times slower than they were with few branches. The hashmap-based table-driven version also gets roughly twice as slow.

Doubling the branch count again gives me a stack overflow, so 1024 is as far as I tested.

Of course, in the vast majority of cases we never need this many branches. The biggest if-else pile of crap I have run into in other languages had only a few dozen branches. Even with comments it was unreadable, and in the end I just added one more lump on top of the pile.
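For readers who want to reproduce timings like the ones above, here is a minimal sketch of a timing harness. It is not the original macro-based code, which the post does not show. The `bench` helper and the 4-branch `match_test` stand-in are assumptions made for illustration, and `std::hint::black_box` is added on the assumption that you want to discourage the optimizer from folding the whole loop away.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Small 4-branch stand-in for the macro-generated `match_test`.
fn match_test(n: usize, mut x: usize) -> usize {
    for _ in 0..n {
        match x {
            1 => x = 2,
            2 => x = 3,
            3 => x = 4,
            _ => x = 1,
        }
    }
    x
}

/// Hypothetical helper: calls `f(n, 1)` `runs` times and returns the
/// wall-clock duration of each call.
fn bench(runs: usize, n: usize, f: fn(usize, usize) -> usize) -> Vec<Duration> {
    (0..runs)
        .map(|_| {
            let start = Instant::now();
            // black_box is a best-effort hint that keeps the inputs and the
            // result opaque to the optimizer.
            black_box(f(black_box(n), black_box(1)));
            start.elapsed()
        })
        .collect()
}

fn main() {
    let timings = bench(10, 10_000_000, match_test);
    print!("match: ");
    for t in &timings {
        // Duration's Debug output picks a unit automatically (ns, µs, ms, s).
        print!("{:?}, ", t);
    }
    println!();
}
```

Build with `--release` when measuring. Debug builds skip most of the optimizations that the comparison between match and if-else depends on.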
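For Rust projects, whenever a pile like that threatens to form and the goal is clear matching, match is still the better first choice, just as the tutorials recommend. The hashmap-based table-driven style is both harder to write and slower, so unless a scenario really requires it (some kinds of route matching, perhaps), I cannot recommend it for everyday branching. By that I mean the habit, common in some languages, of implementing branches with a lookup table. That said, a hashmap is more dynamic and better suited to infrastructure and frameworks: entries can be inserted at runtime based on a config file, whereas match and if-else are hard-coded and can only be changed by editing the source. So the use cases do differ somewhat.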
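To illustrate the dynamic case, here is a minimal sketch of a dispatch table that is filled at runtime. The `build_table` and `step` functions and the `(key, next)` rule pairs are hypothetical names chosen for this example. In a real program the rules might be parsed from a config file, which is not shown here.

```rust
use std::collections::HashMap;

type Callback = Box<dyn Fn(&mut usize)>;

/// Hypothetical helper: builds the dispatch table from (key, next value)
/// pairs that stand in for entries read from a config file.
fn build_table(rules: &[(usize, usize)]) -> HashMap<usize, Callback> {
    let mut table: HashMap<usize, Callback> = HashMap::new();
    for &(key, next) in rules {
        // Each closure owns its own copy of `next` via `move`.
        table.insert(key, Box::new(move |x: &mut usize| *x = next));
    }
    table
}

/// Runs one dispatch step; keys missing from the table reset x to 1.
fn step(table: &HashMap<usize, Callback>, x: &mut usize) {
    // Copy the key out first so the lookup does not hold a borrow of x.
    let key = *x;
    match table.get(&key) {
        Some(func) => func(x),
        None => *x = 1,
    }
}

fn main() {
    // Illustrative rules only; they are not taken from the original post.
    let table = build_table(&[(1, 2), (2, 3), (3, 4)]);
    let mut x = 1;
    for _ in 0..4 {
        step(&table, &mut x);
    }
    println!("{x}");
}
```

The set of branches is data here, so it can change without recompiling. A match cannot offer that, and the price is the boxing and hashing overhead measured above.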
