Tag: bash
We run several processes that may take hours to complete, and it is nice to be notified on a Slack channel when those processes finish correctly. Using Slack's Incoming Webhooks API, a small bash script, and a couple of tricks, it is really simple!
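The idea can be sketched in a few lines. This is a minimal, hypothetical version: the webhook URL is a placeholder (you get a real one when you create an Incoming Webhook for your channel), and `slack_payload` / `notify_slack` are illustrative names, not functions from the post.

```shell
#!/usr/bin/env bash
# Placeholder URL -- replace with the Incoming Webhook URL Slack gives you.
WEBHOOK_URL="${WEBHOOK_URL:-https://hooks.slack.com/services/XXX/YYY/ZZZ}"

# Build the minimal JSON payload Incoming Webhooks accept: a "text" field.
# (Messages containing double quotes would need extra escaping.)
slack_payload() {
  printf '{"text": "%s"}' "$1"
}

# POST the payload to the webhook endpoint.
notify_slack() {
  curl -s -X POST -H 'Content-Type: application/json' \
       --data "$(slack_payload "$1")" "$WEBHOOK_URL"
}

# Usage (only fires if WEBHOOK_URL points at a real hook):
#   ./hours_long_job.sh && notify_slack "job finished correctly" \
#                       || notify_slack "job FAILED"
```

Chaining the notification with `&&`/`||` is the trick that reports success and failure separately.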
Although the tag cloud seems a somewhat outdated and often-criticized visualization format, I have no doubt it can be useful sometimes. And if you can create one with only a few keystrokes, it is pretty sweet. Below I'll show the technique for extracting Twitter #hashtags, but you can apply it to virtually any text source.
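The "few keystrokes" boil down to a classic Unix pipeline. This sketch uses a made-up sample file; on real data you would point it at your tweet dump, and the ranked list feeds any tag-cloud generator.

```shell
# Made-up sample standing in for a dump of tweets.
cat > tweets.txt <<'EOF'
Loving #bash and #jq today
More #bash tricks
#data #bash #jq
EOF

# Pull out hashtags, normalize case, and rank by frequency;
# the counts drive the word sizes in the tag cloud.
grep -oE '#[[:alnum:]_]+' tweets.txt \
  | tr '[:upper:]' '[:lower:]' \
  | sort | uniq -c | sort -rn
```

On the sample above, `#bash` comes out on top with three occurrences.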
Most of our data is stored on MySQL and Cassandra; MySQL was the primary data store when we started the company. Currently our MySQL workload runs on AWS RDS, and we would like to give Microsoft Azure a try. This post documents a few tricks we learned to reduce the total time of dumping, transferring, and restoring. Hope it can help you too.
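One such trick, sketched here with placeholder hosts and credentials (none of these names are from the post): compress the dump in-flight and stream it straight into the target server, so nothing hits the disk twice and the transfer moves far fewer bytes.

```shell
# Hypothetical sketch: SRC_HOST, DST_HOST, DB_USER, DB_PASS and DB_NAME
# are placeholders. --single-transaction avoids locking InnoDB tables,
# --quick streams rows instead of buffering whole tables in memory.
dump_and_restore() {
  mysqldump --single-transaction --quick \
            -h "$SRC_HOST" -u "$DB_USER" -p"$DB_PASS" "$DB_NAME" \
    | gzip -c \
    | ssh "$DST_HOST" "gunzip -c | mysql -u $DB_USER -p$DB_PASS $DB_NAME"
}
```

A single pipeline like this overlaps the dump, the network transfer, and the restore instead of running them as three serial steps.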
Several tutorials assume you own a data set. Often that is not the case, and you just can't take advantage of the tutorial because you don't have data to play along with. To comply with social networks' Terms and Conditions you can't publish your data sets, but you can create your own! Follow along with these few commands.
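As a hypothetical sketch of what "creating your own" can look like (not the post's exact commands): with `twurl`, Twitter's command-line client, configured with your own API credentials, you can append search results to a local file; the endpoint and query below are illustrative examples.

```shell
# Hypothetical sketch, assuming twurl is installed and authorized.
# The endpoint path and the #bash query are examples only.
collect_tweets() {
  twurl "/1.1/search/tweets.json?q=%23bash&count=100" >> my_dataset.json
}

# Usage: run collect_tweets periodically (e.g. from cron) and the
# file grows into a data set that is yours to experiment with.
```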
This post shows how to create heatmaps of conversations taking place on Twitter. It is a proof-of-concept technique to learn more about our current datasets; this knowledge will later be applied to the product development cycle. My objective here is to share a simple way to create a quick visualization and be able to make an internal demo.
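The first step of such a heatmap is getting coordinates out of the tweets. A minimal sketch, on made-up sample data: keep only geotagged tweets and emit `lat,lon` CSV lines that any mapping/heatmap library can consume.

```shell
# Made-up sample: one tweet JSON object per line, Twitter-style
# "coordinates" field ([longitude, latitude] order).
cat > tweets.json <<'EOF'
{"coordinates":{"coordinates":[-58.38,-34.60]},"text":"hola"}
{"coordinates":null,"text":"no geo"}
{"coordinates":{"coordinates":[-56.16,-34.90]},"text":"hi"}
EOF

# Drop tweets without geodata and swap to lat,lon for the heatmap input.
jq -r 'select(.coordinates != null)
       | .coordinates.coordinates
       | "\(.[1]),\(.[0])"' tweets.json > points.csv
```

From `points.csv` on, any heatmap tool (a JavaScript map library, a plotting package) can render the density of conversations.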
Working with JSON datasets is a really common task nowadays; almost any API outputs information in this format. Yet it is still complex to manipulate compared with plain text combined with common Unix commands like cut, awk, sed, etc. To close this gap, jq was developed with exactly this paradigm in mind: jq is like sed for JSON data. This post will walk through the details of how to select fields (projection), flatten arrays, filter JSON documents based on a field value, and convert JSON to CSV/TSV.
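A taste of each of those four operations, on a made-up two-record sample (the data and field names are illustrative, not from the post):

```shell
# Sample: one JSON object per line.
cat > sample.json <<'EOF'
{"name":"ana","age":30,"tags":["dev","ops"]}
{"name":"bob","age":25,"tags":["dev"]}
EOF

# 1. Projection: keep only some fields.
jq '{name, age}' sample.json

# 2. Flatten arrays: one tag per output line.
jq -r '.tags[]' sample.json

# 3. Filter on a field value.
jq 'select(.age > 26)' sample.json

# 4. Convert to TSV (use @csv for CSV).
jq -r '[.name, .age] | @tsv' sample.json
```

The `-r` flag emits raw strings instead of JSON-quoted ones, which is what you want when piping into cut, awk, and friends.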
Today I launched a Spark job that was taking too long to complete, and I had forgotten to start it through screen, so I needed to find a way to keep it running after disconnecting my terminal from the cluster.
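For an already-running foreground job, the standard shell escape hatch looks like this (the job command below is illustrative, not the post's exact invocation):

```shell
# Rescue a job that is ALREADY running in the foreground:
#   Ctrl-Z          suspend it
#   bg              resume it in the background
#   disown -h %1    detach it so the SIGHUP sent on logout does not kill it

# Preventive version for next time: start it hangup-proof from the beginning.
# (spark-submit my_job.py is an illustrative command.)
nohup spark-submit my_job.py > my_job.log 2>&1 &
echo "detached as PID $!"
```

`nohup` plus the output redirection means the job survives the terminal closing and its logs keep accumulating in `my_job.log`.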
Often we have to work with JSON data sets, but now and then data comes in CSV format. I received a great tip from @diegodellera, who told me about textql: execute SQL against structured text like CSV or TSV.
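A minimal sketch of the idea, on a made-up CSV (textql loads the file into an in-memory SQLite table named after the file, so `people.csv` becomes table `people`):

```shell
# Made-up sample data.
cat > people.csv <<'EOF'
name,age
ana,30
bob,25
EOF

# -header treats the first row as column names; -sql is the query to run.
# Guarded so the sketch is a no-op where textql is not installed.
if command -v textql >/dev/null; then
  textql -header -sql "select name from people where age > 26" people.csv
fi
```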
To maintain sessions on remote servers, as recommended by a friend, I started to use tmux.
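The round trip that makes tmux worth adopting can be sketched in a few commands; the session name "etl" is an example, and the detached form below is the scriptable equivalent of interactively running `tmux new -s etl` and pressing Ctrl-b d to detach.

```shell
# Guarded so the sketch is a no-op where tmux is not installed.
if command -v tmux >/dev/null; then
  tmux new-session -d -s etl 'sleep 120'  # start a named session running a job
  tmux ls                                 # after reconnecting: list live sessions
  # tmux attach -t etl                    # reattach (needs an interactive terminal)
  tmux kill-session -t etl                # clean up this demo session
fi
```

The point is that the session, and everything running inside it, survives the SSH connection dropping.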