I have a source collection with the following documents, indexed on the first 4 fields (state, city, zip, store).
[{state: 'NY', city: 'New York', zip: '10000', store: '1234', item: '1234', size: 'L'},
{state: 'NY', city: 'New York', zip: '10000', store: '1234', item: '1234', size: 'L'},
{state: 'NY', city: 'New York', zip: '10100', store: '1234', item: '1234', size: 'L'},
{state: 'NY', city: 'New York', zip: '10100', store: '1234', item: '1234', size: 'L'},
{state: 'NJ', city: 'Newark', zip: '08800', store: '2345', item: '2345', size: 'M'},
{state: 'NJ', city: 'Newark', zip: '08800', store: '2345', item: '2345', size: 'M'},
{state: 'NJ', city: 'Newark', zip: '08810', store: '2345', item: '2345', size: 'M'},
{state: 'NJ', city: 'Newark', zip: '08810', store: '2345', item: '2345', size: 'M'}]
I’d like to copy the distinct documents from my source collection (distinct on the first 4 fields) over to another collection. The new collection should contain the documents below. My source collection is huge, so performance is an important consideration in how to do this copy.
[{state: 'NY', city: 'New York', zip: '10000', store: '1234', item: '1234', size: 'L'},
{state: 'NY', city: 'New York', zip: '10100', store: '1234', item: '1234', size: 'L'},
{state: 'NJ', city: 'Newark', zip: '08800', store: '2345', item: '2345', size: 'M'},
{state: 'NJ', city: 'Newark', zip: '08810', store: '2345', item: '2345', size: 'M'}]
I tried the aggregation pipeline, but I could only get it to list the distinct values of those 4 fields, not the entire documents:
[{state: 'NY', city: 'New York', zip: '10000', store: '1234'},
{state: 'NY', city: 'New York', zip: '10100', store: '1234'},
{state: 'NJ', city: 'Newark', zip: '08800', store: '2345'},
{state: 'NJ', city: 'Newark', zip: '08810', store: '2345'}]
2 Answers
Use $first with $$ROOT to get the first (or last, with $last) document for each combination of the grouping fields. Note: $out will replace the specified collection if it already exists.
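The original pipeline for this answer is not shown here, so the following is only a minimal sketch of the approach it describes. The collection names `source` and `target` are hypothetical, and `allowDiskUse` is an assumption added because the source collection is described as huge and a large `$group` may otherwise hit the stage memory limit.

```javascript
// Sketch: keep one whole document per distinct (state, city, zip, store).
// "source" and "target" are hypothetical collection names.
const dedupeWithGroup = [
  {
    $group: {
      // Group on the 4 indexed fields.
      _id: { state: "$state", city: "$city", zip: "$zip", store: "$store" },
      // Keep the first full document seen in each group.
      doc: { $first: "$$ROOT" }
    }
  },
  // Promote the saved document back to the top level.
  { $replaceRoot: { newRoot: "$doc" } },
  // Write the result; this replaces "target" if it already exists.
  { $out: "target" }
];

// Runs only inside mongosh, where `db` is defined.
if (typeof db !== "undefined") {
  db.source.aggregate(dedupeWithGroup, { allowDiskUse: true });
}
```

Without a preceding $sort, which document counts as "first" in each group is not guaranteed. That does not matter here as long as the duplicates are identical apart from _id.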
You can use $setWindowFields to compute $rank within each partition (i.e. the first 4 fields). Then select the documents with rank = 1 and use $merge to output them to another collection.
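The pipeline for this answer is also missing, so here is a hedged sketch of what it might look like. It assumes MongoDB 5.0 or later (required for $setWindowFields), hypothetical collection names `source` and `target`, and a sort on _id so that exactly one document per partition gets rank 1. The $merge options shown are one reasonable choice, not necessarily the answerer's.

```javascript
// Sketch: rank documents inside each (state, city, zip, store) partition
// and keep only the top-ranked one. "source" and "target" are hypothetical.
const dedupeWithRank = [
  {
    $setWindowFields: {
      partitionBy: { state: "$state", city: "$city", zip: "$zip", store: "$store" },
      // $rank needs exactly one sort field; _id is unique, so there are no ties.
      sortBy: { _id: 1 },
      output: { rank: { $rank: {} } }
    }
  },
  // Keep the first document of every partition.
  { $match: { rank: 1 } },
  // Drop the helper field before writing.
  { $unset: "rank" },
  // Unlike $out, $merge adds to or updates "target" instead of replacing it.
  { $merge: { into: "target", on: "_id", whenMatched: "replace", whenNotMatched: "insert" } }
];

// Runs only inside mongosh, where `db` is defined.
if (typeof db !== "undefined") {
  db.source.aggregate(dedupeWithRank, { allowDiskUse: true });
}
```

Sorting on a non-unique field would give every tied document rank 1, so the duplicates would survive the $match stage.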