Using the ‘du’ and ‘sort’ commands, we can list the largest subdirectories of a given directory. For instance, in my home directory:

jason@mintSandbox ~ $ du -h --max-depth 1 | sort -hr
7.3G	.
4.7G	./data
1.4G	./mount0
391M	./.cache
389M	./bash
258M	./tmp
104M	./texts
35M	./python
28M	./.mozilla
12M	./.adobe
11M	./.config
8.8M	./bin
8.3M	./Downloads
3.7M	./.thumbnails
1.8M	./presentations
1.3M	./.macromedia
884K	./.gstreamer-0.10
524K	./.gimp-2.8
508K	./misc
388K	./.purple
164K	./.local
152K	./.java
144K	./.netExtenderCerts
140K	./.kde
72K	./scripts
64K	./.gconf
36K	./.pki
36K	./.gftp
32K	./.gnome2
28K	./.ssh
16K	./.linuxmint
12K	./.dbus
4.0K	./mount2
4.0K	./mount1
4.0K	./.gnome2_private
4.0K	./Desktop

Here, the ‘du’ command (standing for ‘disk usage’) estimates disk space taken by directories. The ‘-h’ option tells ‘du’ to make the output human readable. The ‘--max-depth 1’ option tells ‘du’ not to dig down within folders. (Try issuing the command without this option and see what happens.) The ‘sort’ command then takes the output of ‘du’ and sorts it by human readable numbers (again, the ‘-h’ option). The extra ‘-r’ option simply tells sort to reverse the sort, so the largest folders come first. We could pipe this to ‘head’ to reduce the number of rows returned to, say, 5 rows:

jason@mintSandbox ~ $ du -h --max-depth 1 | sort -hr | head -n 5
7.3G	.
4.7G	./data
1.4G	./mount0
391M	./.cache
389M	./bash
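
If you run this often, the pipeline can be wrapped in a small function. The following is only a sketch: the name largest_dirs and its two arguments are made up for illustration, and it assumes GNU du and GNU sort (for ‘--max-depth’ and ‘sort -h’).

```shell
# Hypothetical helper: show the N largest entries directly under a directory.
# Assumes GNU du and GNU sort; the function name and arguments are illustrative.
largest_dirs() {
    local dir="${1:-.}"    # directory to inspect (default: current directory)
    local count="${2:-5}"  # number of rows to show (default: 5)
    # Errors such as unreadable subdirectories are hidden to keep the listing clean
    du -h --max-depth 1 -- "$dir" 2>/dev/null | sort -hr | head -n "$count"
}

# Possible usage: the five largest entries under the home directory
# largest_dirs "$HOME" 5
```

As in the listing above, the first row is normally the total for the directory itself.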

Say you have a bunch of subdirectories in your current working directory, each containing variously named files. You want to iterate over those files and apply some Bash command.

For instance, I have folders named 01, 02, 03, …, 31 (representing days in a month), and inside each of those folders sit various files. I wish to gzip each of those files individually. Here’s how I do that with a single line in Bash:

$ for d in */; do for f in "$d"*; do echo "gzip ${f}"; gzip "${f}"; done; done
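
For a script, the same loop can be spread over several lines and made a little more defensive. This is a sketch under a few assumptions of my own: the function name gzip_in_subdirs is hypothetical, only regular files one level down are touched, and files already ending in .gz are skipped.

```shell
# Hypothetical helper: gzip every regular file one level below the given directory.
gzip_in_subdirs() {
    local root="${1:-.}" d f
    for d in "$root"/*/; do
        for f in "$d"*; do
            # Skip anything that is not a regular file (nested or empty directories)
            [ -f "$f" ] || continue
            # Skip files that are already compressed
            case "$f" in *.gz) continue ;; esac
            echo "gzip $f"
            gzip -- "$f"
        done
    done
}

# Possible usage from the directory holding 01, 02, ..., 31:
# gzip_in_subdirs .
```

Quoting "$d" and "$f" keeps file names with spaces intact, and the ‘--’ stops gzip from treating a file name that starts with a dash as an option.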

The ‘date’ program in Linux is incredibly powerful, and can be used to manipulate dates very quickly. Here are some examples.

The current time in my present locale:

$ date
Thu Nov 13 15:41:12 MST 2014

The current time in UTC: (Use UTC, not GMT. One is an international standard of keeping time based on atomic clocks, while the other is a local, old-fashioned timezone based on when the sun is highest in the sky … which isn’t exactly accurate enough for international business. Plus, GMT isn’t even used in the UK when daylight saving time is in effect. Anyway…)

$ date --utc
Thu Nov 13 22:44:06 UTC 2014

The time one day ago in UTC:

$ date --utc -d "now -1 day"
Wed Nov 12 22:44:51 UTC 2014

A specific date minus one day, formatted as we wish:

$ date -d "2014-10-01 -1 day" +%Y-%m-%d
2014-09-30

See the manual page for date for more options.
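
The relative-date syntax also works in a loop. Here is a small sketch that prints a run of consecutive days; the function name list_days is made up for illustration, and it assumes GNU date, since ‘-d’ with relative items is not available in BSD or macOS date.

```shell
# Hypothetical helper: print COUNT consecutive dates starting at START (YYYY-MM-DD).
# Assumes GNU date; --utc avoids surprises around daylight saving changes.
list_days() {
    local start="$1" count="$2" i
    for (( i = 0; i < count; i++ )); do
        date --utc -d "$start +$i day" +%Y-%m-%d
    done
}

list_days 2014-09-29 3
# 2014-09-29
# 2014-09-30
# 2014-10-01
```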


Say you want to transfer a large file over your network with scp, but you don’t want to hoard all of the network resources for this transfer. You can limit scp’s bandwidth usage by using the -l (lower case L) option and specifying your bandwidth limit in Kbit/s.

So, if I want to transfer a file and limit the bandwidth used to 1MB/s, I’d first compute that 1MB is equal to 1024KB, which in turn is equal to 1024 × 8 = 8192Kb. (Here B=byte and b=bit.) So, we end up with the command:

$ scp -l 8192 file_to_transfer user@host:/path-on-other-end
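
To avoid redoing the arithmetic by hand, the conversion can be left to the shell. A minimal sketch, using the same convention as above (1MB = 1024KB, 1 byte = 8 bits); the function name mb_to_kbit is hypothetical.

```shell
# Hypothetical helper: convert a limit in MB/s to the Kbit/s value that scp -l expects.
# Uses 1 MB = 1024 KB and 1 byte = 8 bits, as in the text above.
mb_to_kbit() {
    echo $(( $1 * 1024 * 8 ))
}

mb_to_kbit 1   # prints 8192

# Possible usage (file, user, host and path are the placeholders from the example above):
# scp -l "$(mb_to_kbit 2)" file_to_transfer user@host:/path-on-other-end
```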