
Arrays - Find duplicates in array, print count with pair?

I have an array of value,location pairs:

arr=(test,meta my,amazon test,amazon this,meta test,google my,google hello,microsoft)

I want to print each value, the number of times it occurs, and the locations it occurs in.

For example:

3 test: meta, amazon, google
2 my: amazon, google
1 this: meta
1 hello: microsoft

Here test appears 3 times, in meta, amazon, and google.

So far, this code prints each value once, along with the first location it appears in:

printf '%s\n' "${arr[@]}" | awk -F"," '!_[$1]++'

test,meta
my,amazon
this,meta
hello,microsoft

This prints a count, but it treats each value,location pair as a single value, so every count is 1:

printf '%s\n' "${arr[@]}" | sort | uniq -c | sort -r

   1 this,meta
   1 test,meta
   1 test,google
   1 test,amazon
   1 my,google
   1 my,amazon
   1 hello,microsoft
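
To illustrate why the counts above are all 1, here is a minimal sketch that strips the location before counting, so that uniq -c groups on the value alone. It assumes the same sample array as above; the variable name counts is only for illustration. Note that it gives the right counts but throws the locations away, so it is only half of what is being asked for.

```shell
#!/usr/bin/env bash
# Same sample data as in the question: each element is "value,location".
arr=(test,meta my,amazon test,amazon this,meta test,google my,google hello,microsoft)

# Keep only the first comma-separated field before counting,
# so uniq -c groups on the value instead of the whole pair.
counts=$(printf '%s\n' "${arr[@]}" | cut -d, -f1 | sort | uniq -c | sort -rn)

# One line per distinct value, highest count first.
printf '%s\n' "$counts"
```

The width of the padding in front of each count differs between uniq implementations, and the order of lines with equal counts depends on the locale.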

Solution:

You may consider this solution, which should work with any version of awk:

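printf '%s\n' "${arr[@]}" |
awk -F, '
{
   # Every field except the last is a value; the last field ($NF) is the location.
   for(i=1; i<NF; ++i) {
      # Append the location, comma-separated if this value was seen before.
      row[$i] = ($i in fq ? row[$i] ", " : "") $NF
      ++fq[$i]   # number of occurrences of this value
   }
}
END {
   # The iteration order of for-in is unspecified, hence the sort below.
   for (k in fq) print fq[k], k ":", row[k]
}' | sort -rn -k1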

3 test: meta, amazon, google
2 my: amazon, google
1 this: meta
1 hello: microsoft

Note that I used sort only to match the ordering of your expected output. If you don't care about the ordering, you can remove the sort command.
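
If you would rather see the values in the order they first appear in the array, one possible variant (a sketch, not part of the solution above) is to record each new value in an indexed array and walk that array in the END block. The names order, n and result are introduced here for illustration only, and the sketch assumes each element has exactly the value,location shape, skipping anything without a comma.

```shell
#!/usr/bin/env bash
# Same sample data as in the question: each element is "value,location".
arr=(test,meta my,amazon test,amazon this,meta test,google my,google hello,microsoft)

result=$(printf '%s\n' "${arr[@]}" |
awk -F, '
NF < 2 { next }   # skip elements that have no location
{
   # Remember each value the first time it is seen, in arrival order.
   if (!($1 in fq)) order[++n] = $1
   # Append this location to the list kept for the value.
   row[$1] = ($1 in fq ? row[$1] ", " : "") $NF
   ++fq[$1]
}
END {
   # Walk the values by index instead of for (k in fq), whose order is unspecified.
   for (i = 1; i <= n; ++i) print fq[order[i]], order[i] ":", row[order[i]]
}')

printf '%s\n' "$result"
```

With the sample array this happens to print the same four lines as the sorted version, because the values first appear in descending order of count. In general the output follows first appearance, not count.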