出于对性能的要求,公司希望把Mysql的数据迁移到MongoDB.aspx' >MongoDB上,于是我开始学习Mongo的一些CRUD操作,由于第一次接触NoSQL,还是有点不习惯。
先吐个槽,公司的Mongo版本是2.6.4,而用的java驱动包版本是超级老物2.4版。当时一个“如何对分组后的文档进行筛选”这个需求头痛了很久,虽然shell命令下可以使用Aggregation很方便地解决,但是java驱动包从2.9.0版本才开始支持该特性,我至今没有找到不用Aggregation解决上述需求的办法。只能推荐公司升级驱动包版本,希望没有后续的兼容问题。
Mongo2.2版本后开始支持Aggregation Pipeline,而java驱动包从2.9.0版本才开始支持2.2的特性,2.9版本是12年发布的,mongodb在09年就出现了,可见Mongo对java的开发者似乎不怎么友好←_←
话扯到这里,接下来就对我这一周所学的Mongo做一个总结,错误之处还望指教 :-D。
MongoDB目前提供了三个可以执行聚合操作的命令:aggregate、mapReduce、group。三者在性能和操作的优劣比较见官网提供的表格 Aggregation Commands Comparison,这里不再赘述细节,先说一下这三个函数的原型及底层封装的命令。
函数名 | 函数原型 | 封装的命令 |
db.collection.group() | db.collection.group( { key, reduce, initial [, keyf] [, cond] [, finalize] } ) | db.runCommand( { group: { ns: <namespace>, key: <key>, $reduce: <reduce function>, $keyf: <key function>, cond: <query>, finalize: <finalize function> } } ) |
db.collection.mapReduce() | db.collection.mapReduce( <map>, <reduce>, { out: <collection>, query: <document>, sort: <document>, limit: <number>, finalize: <function>, scope: <document>, jsMode: <boolean>, verbose: <boolean> } ) | db.runCommand( { mapReduce: <collection>, map: <function>, reduce: <function>, finalize: <function>, out: <output>, query: <document>, sort: <document>, limit: <number>, scope: <document>, jsMode: <boolean>, verbose: <boolean> } ) |
db.collection.aggregate() | db.collection.aggregate( pipeline, options ) | db.runCommand( { aggregate: "<collection>", pipeline: [ <stage>, <...> ], explain: <boolean>, allowDiskUse: <boolean>, cursor: <document> } ) |
好记性不如烂笔头,下面通过操作来了解这几个函数和命令:
先准备SQL的测试数据,用来验证结果、比较SQL语句和NoSQL的异同,测试环境是mysql
先创建数据库表
create table dogroup ( _id int, name varchar(45), course varchar(45), score int, gender int, primary key(_id));
插入数据
1 insert into dogroup (_id, name, course, score, gender) values (1, "N", "C", 5, 0); 2 insert into dogroup (_id, name, course, score, gender) values (2, "N", "O", 4, 0); 3 insert into dogroup (_id, name, course, score, gender) values (3, "A", "C", 5, 1); 4 insert into dogroup (_id, name, course, score, gender) values (4, "A", "O", 6, 1); 5 insert into dogroup (_id, name, course, score, gender) values (5, "A", "U", 8, 1); 6 insert into dogroup (_id, name, course, score, gender) values (6, "A", "R", 8, 1); 7 insert into dogroup (_id, name, course, score, gender) values (7, "A", "S", 7, 1); 8 insert into dogroup (_id, name, course, score, gender) values (8, "M", "C", 4, 0); 9 insert into dogroup (_id, name, course, score, gender) values (9, "M", "U", 7, 0);10 insert into dogroup (_id, name, course, score, gender) values (10, "E", "C", 7, 1);
准备MongoDB测试数据
创建Collection(等同于SQL中的表,该行可以不写,Mongo会在插入数据时自动创建Collection)
1 db.createCollection("dogroup")
插入数据
1 db.dogroup.insert({"_id": 1,"name": "N",course: "C","score": 5,gender: 0}) 2 db.dogroup.insert({"_id": 2,"name": "N",course: "O","score": 4,gender: 0}) 3 db.dogroup.insert({"_id": 3,"name": "A",course: "C","score": 5,gender: 1}) 4 db.dogroup.insert({"_id": 4,"name": "A",course: "O","score": 6,gender: 1}) 5 db.dogroup.insert({"_id": 5,"name": "A",course: "U","score": 8,gender: 1}) 6 db.dogroup.insert({"_id": 6,"name": "A",course: "R","score": 8,gender: 1}) 7 db.dogroup.insert({"_id": 7,"name": "A",course: "S","score": 7,gender: 1}) 8 db.dogroup.insert({"_id": 8,"name": "M",course: "C","score": 4,gender: 0}) 9 db.dogroup.insert({"_id": 9,"name": "M",course: "U","score": 7,gender: 0})10 db.dogroup.insert({"_id": 10,"name": "E",course: "C","score": 7,gender: 1})
以下操作可能逻辑上没有实际意义,主要是帮助熟悉指令
1、查询出共有几门课程(course),返回的格式为“课程名、数量”
SQL写法:select course as '课程名', count(*) as '数量' from dogroup group by course;
MongoDB写法:
① group方式
1 db.dogroup.group({ 2 key : { course: 1 }, 3 initial : { count: 0 }, 4 reduce : function Reduce(curr, result) { 5 result.count += 1; 6 }, 7 finalize : function Finalize(out) { 8 return {"课程名": out.course, "数量": out.count}; 9 }10 });
返回的格式如下:
1 { 2 "课程名" : "C", 3 "数量" : 4 4 }, 5 { 6 "课程名" : "O", 7 "数量" : 2 8 }, 9 {10 "课程名" : "U",11 "数量" : 212 },13 {14 "课程名" : "R",15 "数量" : 116 },17 {18 "课程名" : "S",19 "数量" : 120 }
View Code
② mapReduce方式
1 db.dogroup.mapReduce( 2 function () { 3 emit( 4 this.course, 5 {course: this.course, count: 1} 6 ); 7 }, 8 function (key, values) { 9 var count = 0;10 values.forEach(function(val) {11 count += val.count;12 });13 return {course: key, count: count};14 },15 {16 out: { inline : 1 },17 finalize: function (key, reduced) {18 return {"课程名": reduced.course, "数量": reduced.count};19 }20 }21 )
这里把count初始化为1的原因是,MongoDB执行完map函数(第一个函数)后,如果key所对应的values数组的元素个数只有一个,reduce函数(第二个函数)将不会被调用。
返回的格式如下:
1 { 2 "_id" : "C", 3 "value" : { 4 "课程名" : "C", 5 "数量" : 4 6 } 7 }, 8 { 9 "_id" : "O",10 "value" : {11 "课程名" : "O",12 "数量" : 213 }14 },15 {16 "_id" : "R",17 "value" : {18 "课程名" : "R",19 "数量" : 120 }21 },22 {23 "_id" : "S",24 "value" : {25 "课程名" : "S",26 "数量" : 127 }28 },29 {30 "_id" : "U",31 "value" : {32 "课程名" : "U",33 "数量" : 234 }35 }
View Code
③ aggregate方式
1 db.dogroup.aggregate(2 {3 $group:4 {5 _id: "$course",6 count: { $sum: 1 }7 }8 }9 )
返回格式如下:
{ "_id" : "S", "count" : 1 }{ "_id" : "R", "count" : 1 }{ "_id" : "U", "count" : 2 }{ "_id" : "O", "count" : 2 }{ "_id" : "C", "count" : 4 }
View Code
以上三种方式中,group得到了我们想要的结果,mapReduce返回的结果只能嵌套在values里面,aggregate无法为返回结果指定别名。本人才疏学浅,刚接触Mongo,不知道后两者有没有可行的方法获取想要的结果,希望网友指教。
篇幅限制,下面的查询不再贴出返回结果,大家可以自己尝试。
2、查询Docouments(等同于SQL中记录)数大于2的课程
SQL写法:select course, count(*) as count from dogroup group by course having count > 2;
① aggregate方式(注意$group和$match的先后顺序)
1 db.dogroup.aggregate({ 2 $group: { 3 _id: "$course", 4 count: { $sum: 1 } 5 } 6 },{ 7 $match: { 8 count:{ 9 $gt: 210 }11 }12 });
目前尚未找到group和mapReduce对分组结果进行筛选的方法,欢迎网友补充
3、找出所有分数高于5分的考生数量及分数,返回的格式为“分数、数量”
SQL写法:select score as '分数', count(distinct(name)) as '数量' from dogroup where score > 5 group by score;
① group方式
1 db.dogroup.group({ 2 key : { score: 1 }, 3 cond : { score: {$gt: 5} }, 4 initial : { name:[] }, 5 reduce : function Reduce(curr, result) { 6 var flag = true; 7 for(i=0;i<result.name.length&&flag;i++){ 8 if(curr.name==result.name[i]){ 9 flag = false;10 }11 }12 if(flag){13 result.name.push(curr.name);14 }15 },16 finalize : function Finalize(out) {17 return {"分数": out.score, "数量": out.name.length};18 }19 });
② mapReduce方式
1 db.dogroup.mapReduce( 2 function () { 3 if(this.score > 5){ 4 emit( 5 this.score, 6 {score: this.score, name: this.name} 7 ); 8 } 9 },10 function (key, values) {11 var reduced = {score: key, names: []};12 var json = {};//利用json对象的key去重13 for(i = 0; i < values.length; i++){14 if(!json[values[i].name]){15 reduced.names.push(values[i].name);16 json[values[i].name] = 1;17 }18 }19 return reduced;20 },21 {22 out: { inline : 1 },23 finalize: function (key, reduced) {24 return {"分数": reduced.score, "数量": reduced.names?reduced.names.length:1};25 }26 }27 )
③ aggregate方式
1 db.dogroup.aggregate({ 2 $match: { 3 score: { 4 $gt: 5 5 } 6 } 7 },{ 8 $group: { 9 _id: {10 score: "$score",11 name: "$name"12 }13 }14 },{15 $group: {16 _id: {17 "分数": "$_id.score"18 },19 "数量": { $sum: 1 }20 }21 });
弄熟上面这几个方法,大部分的分组应用场景应该没大问题了。
原标题:MongoDB聚合查询
关键词:MongoDB