Quick reminder on how to use multi-line strings in bash.
Tag: bash
Today I had to quickly find the most frequent hashtags in my smallish dataset. After some research I found an awesome shell tool for manipulating JSON: jq, a grep+sed+awk-style tool for JSON.
With jq everything else was simple, just pipe a few commands together:
    $ cat tweets.json | \
        jq -r '.entities.hashtags[].text' | sort | uniq -c | \
        sort -nr |

    $ cat tweets.json |
        jq '.text' |                # select the text field of the JSON
        tr 'A-Z' 'a-z' |            # convert the text to lower case
        grep -oE '#[0-9a-z_]+' |    # extract the hashtags
        sort | uniq -c |            # count the occurrences of each hashtag
        sort -nr | head -10         # sort by descending frequency and keep the top 10

A couple of minutes later, the output was:
I wanted to cache the result of a long computation in bash, so I stored it in a variable. To my surprise, I got everything on one line instead of the desired multi-line output.
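A likely explanation, assuming the variable was printed with an unquoted expansion: bash splits an unquoted value on whitespace (newlines included), and echo then joins the resulting words with single spaces. The sketch below reproduces the effect; `long_computation` is a hypothetical stand-in for the real command, not something from the original post.

```shell
#!/usr/bin/env bash

# Hypothetical stand-in for the long computation; any multi-line output works.
long_computation() {
    printf 'first line\nsecond line\nthird line\n'
}

# Cache the result once. Command substitution strips trailing newlines
# but keeps the ones between lines.
result=$(long_computation)

# Unquoted: the value is split into words and echo prints them on one line.
echo $result

# Quoted: the value is passed as a single argument, newlines intact.
echo "$result"
```

Quoting the expansion (`"$result"`) is enough to keep the line breaks; the assignment itself does not need extra quoting.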
It is a collection of programs for processing delimited-text data through the command line or using shell scripts.