<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>toyosystem &#187; Search Engine</title>
	<atom:link href="http://www.jamboree.jp/cms/archives/category/ruby/%e6%a4%9c%e7%b4%a2%e3%82%a8%e3%83%b3%e3%82%b8%e3%83%b3/feed" rel="self" type="application/rss+xml" />
	<link>http://www.jamboree.jp/cms</link>
	<description>A web programmer living in Nagoya</description>
	<lastBuildDate>Fri, 26 Aug 2011 12:41:42 +0000</lastBuildDate>
	<language>ja</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.1.1</generator>
		<item>
		<title>Fetching HTML with Ruby &#8211; Let's Build a Search Engine</title>
		<link>http://www.jamboree.jp/cms/archives/459</link>
		<comments>http://www.jamboree.jp/cms/archives/459#comments</comments>
		<pubDate>Thu, 09 Jul 2009 20:32:59 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Search Engine]]></category>

		<guid isPermaLink="false">http://www.jamboree.jp/cms/?p=459</guid>
		<description><![CDATA[Rubyでつくる検索エンジン posted with amazlet at 09.07.10 星澤 隆 毎日コミュニケーションズ Sales rank: 82686 View details on Amazon.co.jp After reading this book, I decided to try building my own search engine. And since I was at it anyway, I wanted to follow the book's approach while still building something original (a classic cause of failure, I know). For a start, I have had a borrowed dRuby book for about half a year and want to return it to its owner, so I would like to use dRuby for the crawler part. Crawler / Spider require "hpricot" require "open-uri" require 'kconv' &#160; uri = ARGV&#91;0&#93; &#160; class Crowler &#160; def initialize&#40;uri&#41; &#160; &#160; @uri = uri &#160; &#160; @title = nil &#160; &#160; @description = nil &#160; &#160; @src = [...]]]></description>
			<content:encoded><![CDATA[<div class="amazlet-box" style="margin-bottom:0px;">
<div class="amazlet-image" style="float:left;"><a href="http://www.amazon.co.jp/exec/obidos/ASIN/4839931496/jamboree0f-22/ref=nosim/" name="amazletlink" target="_blank"><img src="http://ecx.images-amazon.com/images/I/41PSJr4B4xL._SL160_.jpg" alt="Rubyでつくる検索エンジン" style="border: none;" /></a></div>
<div class="amazlet-info" style="float:left;margin-left:15px;line-height:120%">
<div class="amazlet-name" style="margin-bottom:10px;line-height:120%"><a href="http://www.amazon.co.jp/exec/obidos/ASIN/4839931496/jamboree0f-22/ref=nosim/" name="amazletlink" target="_blank">Rubyでつくる検索エンジン</a>
<div class="amazlet-powered-date" style="font-size:7pt;margin-top:5px;font-family:verdana;line-height:120%">posted with <a href="http://www.amazlet.com/browse/ASIN/4839931496/jamboree0f-22/ref=nosim/" title="Rubyでつくる検索エンジン" target="_blank">amazlet</a> at 09.07.10</div>
</div>
<div class="amazlet-detail">星澤 隆 <br />毎日コミュニケーションズ <br />Sales rank: 82686</div>
<div class="amazlet-link" style="margin-top: 5px"><a href="http://www.amazon.co.jp/exec/obidos/ASIN/4839931496/jamboree0f-22/ref=nosim/" name="amazletlink" target="_blank">View details on Amazon.co.jp</a></div>
</div>
<div class="amazlet-footer" style="clear: left"></div>
</div>
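<p>After reading this book, I decided to try building my own search engine.<br />
And since I was at it anyway, I wanted to follow the book's approach while still building something original (a classic cause of failure, I know).</p>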
<p>For a start, I have had a borrowed dRuby book for about half a year and want to return it to its owner, so I would like to use dRuby for the crawler part.</p>
<h2>Crawler / Spider</h2>
<div class="syntax_hilite"><span class="langName">RUBY:</span>
<div id="ruby-2">
<div class="ruby">
<ol>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:IG_LINE_COLOUR_1;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color:#CC0066; font-weight:bold;">require</span> <span style="color:#996600;">"hpricot"</span></div>
</li>
<li style="font-weight: bold;color:IG_LINE_COLOUR_2;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color:#CC0066; font-weight:bold;">require</span> <span style="color:#996600;">"open-uri"</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:IG_LINE_COLOUR_1;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color:#CC0066; font-weight:bold;">require</span> 'kconv'</div>
</li>
<li style="font-weight: bold;color:IG_LINE_COLOUR_2;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp;</div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:IG_LINE_COLOUR_1;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">uri = ARGV<span style="color:#006600; font-weight:bold;">&#91;</span><span style="color:#006666;color:#800000;">0</span><span style="color:#006600; font-weight:bold;">&#93;</span></div>
</li>
<li style="font-weight: bold;color:IG_LINE_COLOUR_2;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp;</div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:IG_LINE_COLOUR_1;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color:#9966CC; font-weight:bold;">class</span> Crowler</div>
</li>
<li style="font-weight: bold;color:IG_LINE_COLOUR_2;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp; <span style="color:#9966CC; font-weight:bold;">def</span> initialize<span style="color:#006600; font-weight:bold;">&#40;</span>uri<span style="color:#006600; font-weight:bold;">&#41;</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:IG_LINE_COLOUR_1;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp; &nbsp; @uri = uri</div>
</li>
<li style="font-weight: bold;color:IG_LINE_COLOUR_2;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp; &nbsp; @title = <span style="color:#0000FF; font-weight:bold;">nil</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:IG_LINE_COLOUR_1;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp; &nbsp; @description = <span style="color:#0000FF; font-weight:bold;">nil</span></div>
</li>
<li style="font-weight: bold;color:IG_LINE_COLOUR_2;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp; &nbsp; @src = <span style="color:#0000FF; font-weight:bold;">nil</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:IG_LINE_COLOUR_1;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp; <span style="color:#9966CC; font-weight:bold;">end</span></div>
</li>
<li style="font-weight: bold;color:IG_LINE_COLOUR_2;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp;</div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:IG_LINE_COLOUR_1;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp; <span style="color:#9966CC; font-weight:bold;">def</span> get</div>
</li>
<li style="font-weight: bold;color:IG_LINE_COLOUR_2;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp; &nbsp; <span style="color:#9966CC; font-weight:bold;">begin</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:IG_LINE_COLOUR_1;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp; &nbsp; &nbsp; doc = <span style="color:#CC0066; font-weight:bold;">open</span><span style="color:#006600; font-weight:bold;">&#40;</span>@uri<span style="color:#006600; font-weight:bold;">&#41;</span></div>
</li>
<li style="font-weight: bold;color:IG_LINE_COLOUR_2;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp; &nbsp; &nbsp; @src = Hpricot<span style="color:#006600; font-weight:bold;">&#40;</span>doc.<span style="color:#9900CC;">read</span>.<span style="color:#9900CC;">toutf8</span><span style="color:#006600; font-weight:bold;">&#41;</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:IG_LINE_COLOUR_1;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp; &nbsp; <span style="color:#9966CC; font-weight:bold;">rescue</span> =&gt; ex</div>
</li>
<li style="font-weight: bold;color:IG_LINE_COLOUR_2;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp; &nbsp; &nbsp; <span style="color:#0000FF; font-weight:bold;">return</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:IG_LINE_COLOUR_1;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp; &nbsp; <span style="color:#9966CC; font-weight:bold;">end</span></div>
</li>
<li style="font-weight: bold;color:IG_LINE_COLOUR_2;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp;</div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:IG_LINE_COLOUR_1;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp; &nbsp; @title = <span style="color:#006600; font-weight:bold;">&#40;</span>@src/:title<span style="color:#006600; font-weight:bold;">&#41;</span>.<span style="color:#9900CC;">inner_html</span></div>
</li>
<li style="font-weight: bold;color:IG_LINE_COLOUR_2;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp; &nbsp; desc_element = @src.<span style="color:#9900CC;">search</span><span style="color:#006600; font-weight:bold;">&#40;</span>'meta<span style="color:#006600; font-weight:bold;">&#91;</span>@name=<span style="color:#996600;">"description"</span><span style="color:#006600; font-weight:bold;">&#93;</span>'<span style="color:#006600; font-weight:bold;">&#41;</span>.<span style="color:#9900CC;">first</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:IG_LINE_COLOUR_1;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp; &nbsp; @description = desc_element ? desc_element<span style="color:#006600; font-weight:bold;">&#91;</span><span style="color:#996600;">"content"</span><span style="color:#006600; font-weight:bold;">&#93;</span> : <span style="color:#996600;">""</span></div>
</li>
<li style="font-weight: bold;color:IG_LINE_COLOUR_2;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp; <span style="color:#9966CC; font-weight:bold;">end</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:IG_LINE_COLOUR_1;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color:#9966CC; font-weight:bold;">end</span></div>
</li>
<li style="font-weight: bold;color:IG_LINE_COLOUR_2;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp;</div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:IG_LINE_COLOUR_1;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">Crowler.<span style="color:#9900CC;">new</span><span style="color:#006600; font-weight:bold;">&#40;</span>uri<span style="color:#006600; font-weight:bold;">&#41;</span>.<span style="color:#9900CC;">get</span> </div>
</li>
</ol>
</div>
</div>
</div>
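<p>The class above stores the title and description in instance variables but offers no way to read them back, and running it needs both network access and the hpricot gem. As a minimal sketch of the extraction step alone, the following uses a hypothetical <code>PageSummary</code> class (not part of the original code) that works on an HTML string already in memory. It assumes the input is already UTF-8, and it swaps Hpricot for plain regular expressions, which is good enough for illustration but not for arbitrary real-world HTML.</p>

```ruby
# Hypothetical helper, not from the original post: pulls out the same two
# fields as Crowler#get (page title and meta description), but from an
# in-memory HTML string. Assumes the string is already UTF-8.
class PageSummary
  attr_reader :title, :description

  def initialize(html)
    # Text between the title tags, or "" when there is no title element.
    @title = html[%r{<title[^>]*>(.*?)</title>}mi, 1].to_s.strip

    # First meta tag whose name attribute is "description".
    meta = html[/<meta\b[^>]*\bname=["']description["'][^>]*>/mi].to_s

    # Value of its content attribute, or "" when the tag is missing.
    @description = meta[/\bcontent=(["'])(.*?)\1/mi, 2].to_s
  end
end

html = '<html><head><title> Example page </title>' \
       '<meta name="description" content="A short summary"></head></html>'
page = PageSummary.new(html)
puts page.title        # prints "Example page"
puts page.description  # prints "A short summary"
```

<p>Exposing the fields through <code>attr_reader</code> is one possible way to let an indexer read what the crawler collected. The regular expressions expect the <code>name</code> attribute to come before the closing bracket of the same tag and do not decode HTML entities.</p>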
<h2>Summary</h2>
<p>For now, I will skip the tedious parts and get a small working version finished first.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.jamboree.jp/cms/archives/459/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>

