みずぴー日記 (Mizupi's Diary)

Counting Word Frequencies

Scheme, 30-Minute Programs

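30-minute program, number 192. This one uses the word-splitting program from last time (id:mzp:20071126) to count word frequencies.
It just does the same thing as Concepts, Techniques, and Models of Computer Programming (asin:4798113468).
I'm fairly sure you can get the same result by combining tr, sort, and uniq, but never mind.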
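For comparison, here is a minimal sketch of the tr, sort, and uniq approach mentioned above. It is not part of the original program. The function name `word_freq` is hypothetical, and the sketch assumes a word is a maximal run of ASCII letters, which may not match what split-word treats as a word. awk is added only to print the result in `word: count` form.

```shell
#!/bin/sh
# Sketch: word-frequency count with standard Unix tools.
# Assumption: a "word" is a maximal run of ASCII letters; the Scheme
# split-word procedure may tokenize differently.

# word_freq FILE: print "word: count" lines, sorted by word (byte order).
word_freq() {
  # tr -cs : replace every run of non-letters with a single newline
  # sort   : bring identical words next to each other
  # uniq -c: collapse each group into "count word"
  # awk    : skip the empty line tr may emit, and reorder the two fields
  LC_ALL=C tr -cs 'A-Za-z' '\n' < "$1" |
    LC_ALL=C sort |
    uniq -c |
    awk 'NF == 2 { print $2 ": " $1 }'
}
```

After sourcing this definition, `word_freq word-split.scm` should print lines in the same shape as the sample output in this post, although the counts can differ wherever the two tokenizers disagree. `LC_ALL=C` makes the sort order byte-wise, so uppercase words come before lowercase ones.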

Usage

$ ./lfreq word-split.scm
Author: 1
C: 1
Copyright: 1
Hiroki: 1
MIZUNO: 1
Scheme: 1
This: 1
Timestamp: 1
accumulate: 3
and: 2

Source code

This is the first time I have used load. There are also use, require, and so on, but when I simply want to read in a file, load should be fine, right?

#! /opt/local/bin/gosh
;; -*- mode:scheme; coding:utf-8 -*-
;;
;; freq.scm - word freq
;;
;; Copyright(C) 2007 by mzp
;; Author: MIZUNO Hiroki / mzpppp at gmail dot com
;; http://howdyworld.org
;;
;; Timestamp: 2007/11/27 15:44:20
;;
;; This program is free software; you can redistribute it and/or
;; modify it under the same terms as Scheme itself.
;;

(load "./191-word-split")

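;; Build a hash table mapping each word in WORD-LIST to its occurrence count.
(define (count-word word-list)
  (let1 table (make-hash-table 'string=?)
    (for-each (lambda (word)
                ;; a word not seen yet starts from the default count 0
                (hash-table-put! table
                                 word
                                 (+ 1
                                    (hash-table-get table word 0))))
              word-list)
    table))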

;; Like hash-table-for-each, but calls F on the entries in sorted key order.
(define (sort-each table f)
  (let1 keys (sort (hash-table-keys table))
    (for-each (lambda (key) (f key (hash-table-get table key)))
              keys)))

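;; Count the words in FILE and print "word: count" lines, sorted by word.
;; call-with-input-file closes the port once port->string has read it.
(define (count-file file)
  (sort-each (count-word (split-word (call-with-input-file file port->string)))
             (lambda (key value) (format #t "~A: ~A~%" key value))))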

(define (main argv)
  (for-each count-file (cdr argv)))

References