A few ways to use grep

May 30, 2016

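Do you use grep? This article explains how to use grep in a Windows environment.

(That said, it is really just a memo of the commands I use most often...)

I have tried to stick to commands that work the same way on Mac and Linux.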

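What is grep?

It is read as a single word, "grep". It is a handy command that searches the contents of files according to regular expression rules.

According to Wikipedia:

The name grep comes from the command g/re/p of the line editor ed, which means "from the whole file / take the lines that match the regular expression / and print them" [1].

I did not know it came from g/re/p.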
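To make the "print the lines that match the regular expression" idea concrete, here is a minimal sketch. The file name menu.txt and its contents are made up for illustration, and it assumes a POSIX-like shell with mktemp available.

```shell
# Create a small sample file in a temporary directory (hypothetical data).
workdir=$(mktemp -d)
printf 'apple pie\nbanana bread\ncherry pie\n' > "$workdir/menu.txt"

# Print every line that matches the regular expression 'pie$',
# that is, every line ending in "pie".
matches=$(grep 'pie$' "$workdir/menu.txt")
echo "$matches"
```

Only the two lines ending in "pie" are printed; the "banana bread" line is skipped.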

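Installation

On Windows, there are a few ways to get it:

Use Gow
Use Grep for Windows

I went with Gow. Gow is software that lets you run Linux commands on Windows.

You get Linux commands without installing something like Cygwin, and they are easy to use straight from the Command Prompt, so it is a perfect fit for a casual user like me.

As usual, I installed it with Chocolatey.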

cinst -y gow

On Mac and Linux, grep is installed from the start.

Usage

Once you have picked up a few patterns, you can look up the rest as you go.

Viewing the help

grep --help

Here is the output.

Usage: grep [OPTION]... PATTERN [FILE]...
Search for PATTERN in each FILE or standard input.
PATTERN is, by default, a basic regular expression (BRE).
Example: grep -i 'hello world' menu.h main.c

Regexp selection and interpretation:
  -E, --extended-regexp     PATTERN is an extended regular expression (ERE)
  -F, --fixed-strings       PATTERN is a set of newline-separated fixed strings
  -G, --basic-regexp        PATTERN is a basic regular expression (BRE)
  -P, --perl-regexp         PATTERN is a Perl regular expression
  -e, --regexp=PATTERN      use PATTERN for matching
  -f, --file=FILE           obtain PATTERN from FILE
  -i, --ignore-case         ignore case distinctions
  -w, --word-regexp         force PATTERN to match only whole words
  -x, --line-regexp         force PATTERN to match only whole lines
  -z, --null-data           a data line ends in 0 byte, not newline

Miscellaneous:
  -s, --no-messages         suppress error messages
  -v, --invert-match        select non-matching lines
  -V, --version             print version information and exit
      --help                display this help and exit
      --mmap                use memory-mapped input if possible

Output control:
  -m, --max-count=NUM       stop after NUM matches
  -b, --byte-offset         print the byte offset with output lines
  -n, --line-number         print line number with output lines
      --line-buffered       flush output on every line
  -H, --with-filename       print the filename for each match
  -h, --no-filename         suppress the prefixing filename on output
      --label=LABEL         print LABEL as filename for standard input
  -o, --only-matching       show only the part of a line matching PATTERN
  -q, --quiet, --silent     suppress all normal output
      --binary-files=TYPE   assume that binary files are TYPE;
                            TYPE is `binary', `text', or `without-match'
  -a, --text                equivalent to --binary-files=text
  -I                        equivalent to --binary-files=without-match
  -d, --directories=ACTION  how to handle directories;
                            ACTION is `read', `recurse', or `skip'
  -D, --devices=ACTION      how to handle devices, FIFOs and sockets;
                            ACTION is `read' or `skip'
  -R, -r, --recursive       equivalent to --directories=recurse
      --include=FILE_PATTERN  search only files that match FILE_PATTERN
      --exclude=FILE_PATTERN  skip files and directories matching FILE_PATTERN
      --exclude-from=FILE   skip files matching any file pattern from FILE
      --exclude-dir=PATTERN directories that match PATTERN will be skipped.
  -L, --files-without-match print only names of FILEs containing no match
  -l, --files-with-matches  print only names of FILEs containing matches
  -c, --count               print only a count of matching lines per FILE
  -T, --initial-tab         make tabs line up (if needed)
  -Z, --null                print 0 byte after FILE name

Context control:
  -B, --before-context=NUM  print NUM lines of leading context
  -A, --after-context=NUM   print NUM lines of trailing context
  -C, --context=NUM         print NUM lines of output context
  -NUM                      same as --context=NUM
      --color[=WHEN],
      --colour[=WHEN]       use markers to highlight the matching strings;
                            WHEN is `always', `never', or `auto'
  -U, --binary              do not strip CR characters at EOL (MSDOS)
  -u, --unix-byte-offsets   report offsets as if CRs were not there (MSDOS)

`egrep' means `grep -E'.  `fgrep' means `grep -F'.
Direct invocation as either `egrep' or `fgrep' is deprecated.
With no FILE, or when FILE is -, read standard input.  If less than two FILEs
are given, assume -h.  Exit status is 0 if any line was selected, 1 otherwise;
if any error occurs and -q was not given, the exit status is 2.

Report bugs to: bug-grep@gnu.org
GNU Grep home page: <http://www.gnu.org/software/grep/>
General help using GNU software: <http://www.gnu.org/gethelp/>

On Mac or Linux, the following command works too.

man grep

Note: man is short for manual.
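The help output above mentions that the exit status is 0 if any line was selected and 1 otherwise. That makes grep usable as a condition in scripts. The following is a rough sketch; the log file app.log and its contents are invented for the example.

```shell
# Sample log file in a temporary directory (hypothetical data).
workdir=$(mktemp -d)
printf 'INFO start\nERROR disk full\n' > "$workdir/app.log"

# -q suppresses normal output, so only the exit status is used.
if grep -q 'ERROR' "$workdir/app.log"; then
    found="yes"
else
    found="no"
fi
echo "ERROR found: $found"

# When nothing matches, the exit status is 1.
status=0
grep -q 'FATAL' "$workdir/app.log" || status=$?
echo "exit status when nothing matches: $status"
```

The `|| status=$?` form records the status without aborting a script that runs with `set -e`.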

Finding files with a given extension that contain the keyword LANG

This is the command I used when installing WordPress. I was looking for the PHP files that mention LANG.

grep "LANG" *.php
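To get a feel for what that search prints, here is a small reproducible sketch. The two PHP files and their contents are stand-ins created just for this example, not the real WordPress files.

```shell
# Two made-up PHP files in a temporary directory.
workdir=$(mktemp -d)
printf "<?php\ndefine('WPLANG', 'ja');\n" > "$workdir/wp-config.php"
printf "<?php\necho 'hello';\n" > "$workdir/index.php"

# The shell expands *.php to both files; grep prints each matching line.
result=$(cd "$workdir" && grep "LANG" *.php)
echo "$result"
```

Because more than one file is searched, each matching line is prefixed with the name of the file it came from, followed by a colon.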

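Searching below the current directory for a given string

To search the *.java files under the current directory for the string hogehoge, run the command below. (Someone had left the string hogehoge in the logs...)

-r means "recursively", and . means the current directory.

grep -r --include='*.java' 'hogehoge' .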
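Here is a sketch of how --include narrows a recursive search. The directory layout and file contents are invented for the example, and it assumes a grep that supports --include (GNU grep does).

```shell
# Build a small made-up source tree in a temporary directory.
workdir=$(mktemp -d)
mkdir -p "$workdir/src/util"
printf 'class Main {\n  // log.debug("hogehoge");\n}\n' > "$workdir/src/Main.java"
printf 'class Util {}\n' > "$workdir/src/util/Util.java"
printf 'hogehoge appears here too\n' > "$workdir/src/util/notes.txt"

# Recurse from the current directory, but only look inside *.java files.
result=$(cd "$workdir" && grep -r --include='*.java' 'hogehoge' .)
echo "$result"
```

notes.txt also contains hogehoge, but it is skipped because its name does not match the --include pattern.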

Showing line numbers while searching

Just add -n or --line-number to the options.

grep -r -n --include='*.java' 'hogehoge' .
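A quick sketch of what the line numbers look like in the output, using a throwaway file (Sample.java and its contents are made up):

```shell
# One made-up file with two matching lines.
workdir=$(mktemp -d)
printf 'line one\nhogehoge here\nline three\nhogehoge again\n' > "$workdir/Sample.java"

# With -n, each result is printed as file:line-number:matching-line.
result=$(cd "$workdir" && grep -r -n --include='*.java' 'hogehoge' .)
echo "$result"
```

The number between the two colons is the 1-based line number inside the file, which is handy for jumping to the spot in an editor.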

Grepping binary files as text

Just add -a or --text to the options.

grep -r -a --include='*.java' 'hogehoge' .
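As a rough illustration of what -a (--text) changes, the sketch below builds a tiny file containing a NUL byte, which makes grep treat it as binary. The file name data.bin and its contents are assumptions for the example; -o is used so that only the matched text is captured.

```shell
# A made-up "binary" file: the NUL byte (\0) makes grep treat it as binary.
workdir=$(mktemp -d)
printf 'KEY=12345\0payload\n' > "$workdir/data.bin"

# Without -a, GNU grep typically prints only a notice that the binary file matches.
# With -a, the file is processed as text, and -o prints just the matched part.
result=$(grep -a -o 'KEY=[0-9]*' "$workdir/data.bin")
echo "$result"
```

Be a little careful with -a on real binaries: the matching lines can contain arbitrary bytes, and printing them may garble the terminal.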

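Picking out running processes

grep is often used on its own, but it is also very often used on the receiving end of a pipe.

Examples: checking whether a particular process is alive, or whether a log file with a particular name exists.

# Pick out only the httpd (Apache) related processes.
ps aux | grep httpd
# Read binary files as text too, and show the lines that match the regular expression KEY.*12345
grep -ar 'KEY.*12345' .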
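One thing to watch for with ps aux | grep httpd is that the grep process itself usually shows up in the results, because its own command line contains "httpd". A common workaround is to filter it out with -v. The sketch below uses a fake_ps function as a stand-in for ps aux, so the output is reproducible; the function name and the process list are invented.

```shell
# Stand-in for `ps aux` output (hypothetical process list).
fake_ps() {
    printf 'root  101  /usr/sbin/httpd -k start\n'
    printf 'alice 202  grep httpd\n'
    printf 'alice 303  vim notes.txt\n'
}

# A plain search also matches the grep process itself.
plain=$(fake_ps | grep httpd)

# -v inverts the match, dropping any line that contains "grep".
filtered=$(fake_ps | grep httpd | grep -v grep)
echo "$filtered"
```

With the real command, this would be written as ps aux | grep httpd | grep -v grep.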

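Summary

Most of what IT and web engineers do at a computer is text manipulation. I am still at the level of looking commands up as I go, but the people around me move back and forth between terminal and editor, and from file to file, at blazing speed.

I think the grep command is the first step toward working like that. Once you understand how it is put together, downloading some OSS and using grep to dig through the source code might be good practice. (An IDE can do this too, but use grep on purpose.)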

Think Simple Enjoy Life