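You can find duplicate records in MongoDB with the aggregation framework. To illustrate the approach, first create a collection that contains some documents with repeated values. The queries to create the collection and insert the documents are as follows: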
> db.findDuplicateRecordsDemo.insertOne({"StudentFirstName":"John"});
{ "acknowledged" : true, "insertedId" : ObjectId("5c8a330293b406bd3df60e01") }
> db.findDuplicateRecordsDemo.insertOne({"StudentFirstName":"John"});
{ "acknowledged" : true, "insertedId" : ObjectId("5c8a330493b406bd3df60e02") }
> db.findDuplicateRecordsDemo.insertOne({"StudentFirstName":"Carol"});
{ "acknowledged" : true, "insertedId" : ObjectId("5c8a330c93b406bd3df60e03") }
> db.findDuplicateRecordsDemo.insertOne({"StudentFirstName":"Sam"});
{ "acknowledged" : true, "insertedId" : ObjectId("5c8a331093b406bd3df60e04") }
> db.findDuplicateRecordsDemo.insertOne({"StudentFirstName":"Carol"});
{ "acknowledged" : true, "insertedId" : ObjectId("5c8a331593b406bd3df60e05") }
> db.findDuplicateRecordsDemo.insertOne({"StudentFirstName":"Mike"});
{ "acknowledged" : true, "insertedId" : ObjectId("5c8a331e93b406bd3df60e06") }
Display all documents in the collection with the find() method. The query is as follows:
> db.findDuplicateRecordsDemo.find();
The output is as follows:
{ "_id" : ObjectId("5c8a330293b406bd3df60e01"), "StudentFirstName" : "John" }
{ "_id" : ObjectId("5c8a330493b406bd3df60e02"), "StudentFirstName" : "John" }
{ "_id" : ObjectId("5c8a330c93b406bd3df60e03"), "StudentFirstName" : "Carol" }
{ "_id" : ObjectId("5c8a331093b406bd3df60e04"), "StudentFirstName" : "Sam" }
{ "_id" : ObjectId("5c8a331593b406bd3df60e05"), "StudentFirstName" : "Carol" }
{ "_id" : ObjectId("5c8a331e93b406bd3df60e06"), "StudentFirstName" : "Mike" }
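Here is the query to find duplicate records in MongoDB. The $group stage counts the documents for each StudentFirstName value, the $match stage keeps only the non-null values that occur more than once, and the $project stage returns each value under its original field name: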
> db.findDuplicateRecordsDemo.aggregate([
...    { "$group" : { "_id" : "$StudentFirstName", "count" : { "$sum" : 1 } } },
...    { "$match" : { "_id" : { "$ne" : null }, "count" : { "$gt" : 1 } } },
...    { "$project" : { "StudentFirstName" : "$_id", "_id" : 0 } }
... ]);
The output, which lists only the values that occur more than once, is as follows:
{ "StudentFirstName" : "Carol" }
{ "StudentFirstName" : "John" }
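To make the logic of the three stages easier to follow, the sketch below mimics them in plain JavaScript on an in-memory array, without a database. It is only an illustration under stated assumptions: findDuplicateValues is a hypothetical helper name, not part of MongoDB, and the students array simply repeats the sample documents from above without their _id fields.

```javascript
// Plain-JavaScript sketch of the $group -> $match -> $project stages.
// findDuplicateValues is a hypothetical helper, not a MongoDB API.
function findDuplicateValues(docs, field) {
  // Like $group: count how many documents share each value of `field`.
  const counts = new Map();
  for (const doc of docs) {
    const value = doc[field] ?? null; // a missing field is grouped under null
    counts.set(value, (counts.get(value) || 0) + 1);
  }

  const result = [];
  for (const [value, count] of counts) {
    // Like $match: keep non-null values that occur more than once.
    if (value !== null && count > 1) {
      // Like $project: return the value under its original field name.
      result.push({ [field]: value });
    }
  }
  return result;
}

// Sample data mirroring the collection above (assumed, _id omitted).
const students = [
  { StudentFirstName: "John" },
  { StudentFirstName: "John" },
  { StudentFirstName: "Carol" },
  { StudentFirstName: "Sam" },
  { StudentFirstName: "Carol" },
  { StudentFirstName: "Mike" },
];

console.log(findDuplicateValues(students, "StudentFirstName"));
```

The sketch returns John and Carol in first-seen order. The aggregation pipeline itself does not guarantee the order of groups, so the shell output may list the same values in a different order unless a $sort stage is added.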