Arrays - Find duplicates in array, print count with pair?

2024-03-15 22:30:06

I have an array of value,location pairs

arr=(test,meta my,amazon test,amazon this,meta test,google my,google hello,microsoft)

I want to print the duplicate values, the number/count of them, along with the location.

For example:

3 test: meta, amazon, google
2 my: amazon, google
1 this: meta
1 hello: microsoft

Here test appears 3 times, in meta, amazon, and google

So far, this code prints only the first value,location pair seen for each value:

printf '%s\n' "${arr[@]}" | awk -F"," '!_[$1]++'
test,meta
my,amazon
this,meta
hello,microsoft

This prints a count, but it treats each value,location pair as a single value, so every count is 1:

printf '%s\n' "${arr[@]}" | sort | uniq -c | sort -r
   1 my,amazon
   1 my,google
   1 this,meta
   1 test,meta
   1 test,google
   1 test,amazon
   1 hello,microsoft

Solution:

You may consider this solution, which should work with any version of awk:

printf '%s\n' "${arr[@]}" |
awk -F, '
{
   for(i=1; i<NF; ++i) {
      row[$i] = ($i in fq ? row[$i] ", " : "") $NF
      ++fq[$i]
   }
}
END {
   for (k in fq) print fq[k], k ":", row[k]
}' | sort -rn -k1

3 test: meta, amazon, google
2 my: amazon, google
1 this: meta
1 hello: microsoft

Note that I have used sort only to match the order of your expected output. If you don't care about the ordering, you can remove the sort command.
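If you would rather stay in the shell, the same grouping can be sketched in pure bash. This is only an illustration under some assumptions: it needs bash 4 or later for associative arrays, it assumes each element holds exactly one value and one location separated by the first comma, and the function name group_pairs is made up for this example.

```shell
#!/usr/bin/env bash
# Sketch only: requires bash 4+ (associative arrays).
arr=(test,meta my,amazon test,amazon this,meta test,google my,google hello,microsoft)

# group_pairs is a hypothetical helper name, not part of the original answer.
group_pairs() {
   local -A count=() places=()
   local pair value location
   for pair in "$@"; do
      value=${pair%%,*}       # text before the first comma
      location=${pair#*,}     # text after the first comma
      # Append the location, adding ", " only when one is already stored
      places[$value]+="${places[$value]:+, }$location"
      count[$value]=$(( ${count[$value]:-0} + 1 ))
   done
   # Associative arrays have no defined iteration order
   for value in "${!count[@]}"; do
      printf '%s %s: %s\n' "${count[$value]}" "$value" "${places[$value]}"
   done
}

# Sort numerically on the leading count, highest first
group_pairs "${arr[@]}" | sort -rn
```

As with the awk version, sort is there only to put the highest counts first. How lines with equal counts are ordered depends on how sort breaks ties.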
