Unix & Linux Commands Cookbook
2025-02-19 workflow unix programming
- [[HUM307 Command-line Tips and Tricks from Brian Kernighan (archived)]]
- [[Problem solving with Unix commands Vegard Stikbakke (archived)]]
Download a whole website
wget -w 2 -r -np -k -p www.example.com
`-w 2` (wait 2 seconds between requests), `-r` (recursive), `-np` (don't ascend to the parent directory), `-k` (convert links for local viewing), `-p` (fetch page requisites such as images and CSS)
Extract text of a webpage
curl www.google.com | lynx -dump -stdin
`-dump` (extract the text), `-stdin` (read from the pipe)
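A small variation, not in the original note: lynx can also fetch the page itself, so the curl step is optional.

```bash
# let lynx fetch and dump the page text directly
lynx -dump www.google.com
```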
Search a directory tree for a word in file names (limit tree depth to 2, case-insensitive)
tree -L 2 | grep 'lisp' -i
`-L` (limit the depth), `-i` (ignore case)
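A sketch of an equivalent without tree, using find's case-insensitive name matching (assumes your find supports `-maxdepth` and `-iname`, as both GNU and BSD find do):

```bash
# search two levels deep for names containing "lisp", ignoring case
find . -maxdepth 2 -iname '*lisp*'
```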
Find files modified within last 3 days
find . -mtime -3
`-mtime -3` (modified within the last 3 days)
- You can use `+3` instead to find files modified more than 3 days ago
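For finer granularity, `-mmin` works the same way in minutes:

```bash
# files modified within the last 60 minutes
find . -mmin -60
```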
Add the same line to all files recursively in a directory
find . -type f -name "*.md" -exec sh -c 'echo "This line will be added to all" >> "$0"' {} \;
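The `$0` trick works because find's `{}` becomes the shell's first argument. A slightly more conventional sketch passes a dummy argv[0] so the filenames arrive as `"$@"`, and batches files with `+`:

```bash
# append the same line to every matching file; "sh" fills $0, the files arrive as "$@"
find . -type f -name "*.md" -exec sh -c 'for f in "$@"; do echo "This line will be added to all" >> "$f"; done' sh {} +
```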
Move all files in subdirectories up to the current directory
find . -mindepth 2 -type f -exec mv {} . \;
`-mindepth 2` ensures we only get files in subdirectories (not those already in the current directory)
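If two subdirectories contain files with the same name, the later move silently overwrites the earlier one. `mv -n` (no-clobber, supported by both GNU and BSD mv) refuses to overwrite instead:

```bash
# move files up without overwriting anything already in the current directory
find . -mindepth 2 -type f -exec mv -n {} . \;
```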
Remove all empty directories
find . -mindepth 1 -type d -empty -delete
`-mindepth 1` ensures we don't try deleting the current directory itself
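To preview which directories would be removed, drop `-delete` and just list them first:

```bash
# dry run: list the empty directories without removing them
find . -mindepth 1 -type d -empty
```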
Find large files over 100MB and sort them by size
find . -type f -size +100M -exec ls -lh {} \; | sort -rh -k5
`-size +100M` (files larger than 100MB), `sort -rh -k5` (reverse human-readable sort on the size column)
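A rough alternative using du, if you just want the biggest items under the current path (assumes a sort with `-h` support, e.g. GNU or a recent BSD sort):

```bash
# list the 20 largest files and directories, human-readable sizes, biggest first
du -ah . | sort -rh | head -n 20
```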
Find and replace text in multiple files
find . -type f -name "*.txt" -exec sed -i 's/oldtext/newtext/g' {} +
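The bare `-i` above is the GNU sed form; BSD/macOS sed takes a (possibly empty) backup-suffix argument after `-i`:

```bash
# same replacement with BSD/macOS sed (empty suffix = no backup files)
find . -type f -name "*.txt" -exec sed -i '' 's/oldtext/newtext/g' {} +
```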
Find duplicate files based on content (not name)
find . -type f -exec md5sum {} \; | sort | uniq -w32 -dD
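`uniq -w32 -dD` compares only the first 32 characters (the MD5 hash) and prints every line in each duplicated group; note that `-w` and `-D` are GNU extensions. A sketch of a rough awk equivalent for systems without GNU uniq (it assumes `md5sum`; on macOS, `md5 -r` produces similar output):

```bash
# print every group of files that share an MD5 hash
find . -type f -exec md5sum {} + \
  | sort \
  | awk '{ if ($1 == prev) { if (!shown) print prevline; print; shown = 1 }
           else shown = 0
           prev = $1; prevline = $0 }'
```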
Create a simple HTTP server in current directory
python3 -m http.server 8080
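By default this listens on all interfaces; Python 3.4+ accepts `--bind` to restrict it:

```bash
# serve the current directory on localhost only
python3 -m http.server 8080 --bind 127.0.0.1
```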
Generate a tree view of directory excluding certain patterns
tree -I 'node_modules|cache|tmp|vendor|.git' --dirsfirst -aC
Remove empty lines from a file
sed -i '' '/^[[:space:]]*$/d' file-path
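The `-i ''` form is for BSD/macOS sed; with GNU sed the in-place flag takes no argument:

```bash
# GNU sed: delete whitespace-only lines in place
sed -i '/^[[:space:]]*$/d' file-path
```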
Find all URLs in files under a directory, strip their http(s):// prefixes and trailing slashes, and list them
find . -type f \( -name "*.txt" -o -name "*.md" \) -exec perl -lne 'print $1 while /(https?:\/\/[^\s)\]]+)/g' {} \; | sed -e 's|^https://||;s|^http://||' -e 's/\/$//'
Compare URLs in two files: extract all URLs from each file, strip their http(s):// prefixes and trailing slashes, and diff the results
capture_urls() { perl -lne 'print $1 while /(https?:\/\/[^\s)\]]+)/g' "$1" | sed -e 's|^https://||;s|^http://||' -e 's/\/$//'; }
diff <(capture_urls file1.md) <(capture_urls file2.md)
`diff <(...) <(...)`: compare the output of two commands via process substitution
For each file:
- Extract URLs using perl (`perl -lne 'print $1 while /(https?:\/\/[^\s)\]]+)/g'`)
- Clean the URLs by removing the http(s) prefix and trailing slashes using sed
Shows the differences between the two files:
- Lines starting with `<` appear only in file1
- Lines starting with `>` appear only in file2
- No output means the files contain the same URLs
Use clipboard contents in a pipeline
pbpaste | <your-command>
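`pbpaste` (and its counterpart `pbcopy`) are macOS commands; on Linux under X11, xclip can play the same role, assuming it's installed:

```bash
# Linux/X11 equivalent: read the clipboard selection and pipe it onward
xclip -selection clipboard -o | <your-command>
```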
Prettify YouTube transcripts
- Copy the transcript to the clipboard (the raw text is a jumble of timestamps and newlines)
- Run the following:
pbpaste | sed 's/[0-9]:[0-9][0-9]//g' | tr -d '\n'
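The pattern above only matches single-digit minutes; a sketch that also strips `MM:SS` and `H:MM:SS` timestamps and joins lines with spaces rather than deleting the newlines outright (`sed -E` works on both GNU and BSD sed):

```bash
# remove timestamps like 0:05, 12:34, or 1:02:03, then flatten to a single spaced line
pbpaste | sed -E 's/([0-9]+:)?[0-9]+:[0-9]{2}//g' | tr '\n' ' '
```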
References
Outgoing Internal References
- [[HUM307 Command-line Tips and Tricks from Brian Kernighan (archived)]]
- [[Problem solving with Unix commands Vegard Stikbakke (archived)]]
Outgoing Web References
- blog.sanctum.geek.nz/series/unix-as-ide/
- Balaji's Startup Engineering Slides: balajicourse.s3.us-east-2.amazonaws.com/Startup+Engineering+Full+Course.pdf
- Taco Bell Programming Post: web.archive.org/web/20101101173837/http://teddziuba.com/2010/10/taco-bell-programming.html