<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>soledad penadés &#187; hpricot</title>
	<atom:link href="http://soledadpenades.com/tag/hpricot/feed/" rel="self" type="application/rss+xml" />
	<link>http://soledadpenades.com</link>
	<description>repeat 4[fd 100 rt 90]</description>
	<lastBuildDate>Sun, 29 Jan 2012 23:03:40 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>How to install hpricot in Ubuntu 8.04</title>
		<link>http://soledadpenades.com/2008/10/24/how-to-install-hpricot-in-ubuntu-84/</link>
		<comments>http://soledadpenades.com/2008/10/24/how-to-install-hpricot-in-ubuntu-84/#comments</comments>
		<pubDate>Fri, 24 Oct 2008 09:56:59 +0000</pubDate>
		<dc:creator>sole</dc:creator>
				<category><![CDATA[Software]]></category>
		<category><![CDATA[hpricot]]></category>
		<category><![CDATA[ruby]]></category>
		<category><![CDATA[ubuntu]]></category>

		<guid isPermaLink="false">http://soledadpenades.com/?p=778</guid>
		<description><![CDATA[This could be considered a fresh installation, speaking in ruby terms. I just had ruby installed, no ruby gems, nor ruby dev nor anything else ruby. So this should be enough for installing hpricot as well as ruby gems (which are required for installing hpricot). As you can see, I didn&#8217;t download any source file, [...]]]></description>
			<content:encoded><![CDATA[<p>This could be considered a fresh installation, in ruby terms: I only had ruby itself installed, with no rubygems, no ruby dev package, nor anything else ruby-related. So the steps below should be enough to install hpricot as well as rubygems (which is required for installing hpricot).</p>
<p>As you can see, I didn&#8217;t download any source files. Instead I was happy to use apt-get and the packages from the ubuntu repositories, although they are relatively old (for example, rubygems is more than a year old). If I run into any problems and need to update to newer versions, I&#8217;ll report that here ;-)</p>
<div class="syhi_block"><code><span style="color: #c20cb9; font-weight: bold;">sudo</span> <span style="color: #c20cb9; font-weight: bold;">apt-get</span> <span style="color: #c20cb9; font-weight: bold;">install</span> rubygems<br />
<span style="color: #c20cb9; font-weight: bold;">sudo</span> <span style="color: #c20cb9; font-weight: bold;">rm</span> <span style="color: #000000; font-weight: bold;">/</span>var<span style="color: #000000; font-weight: bold;">/</span>lib<span style="color: #000000; font-weight: bold;">/</span>gems<span style="color: #000000; font-weight: bold;">/</span><span style="color: #000000;">1.8</span><span style="color: #000000; font-weight: bold;">/</span>source_cache<br />
<span style="color: #c20cb9; font-weight: bold;">sudo</span> gem update<br />
<span style="color: #c20cb9; font-weight: bold;">sudo</span> <span style="color: #c20cb9; font-weight: bold;">apt-get</span> <span style="color: #c20cb9; font-weight: bold;">install</span> ruby1.8-dev<br />
<span style="color: #c20cb9; font-weight: bold;">sudo</span> gem <span style="color: #c20cb9; font-weight: bold;">install</span> hpricot</code></div>
<p>It&#8217;s a pity there isn&#8217;t a metapackage for ruby&#8217;s development files (the ruby1.8-dev package), in the same way that there&#8217;s a <strong>ruby</strong> metapackage which depends on the <strong>ruby1.8</strong> package, so that whenever ruby is updated the installed version follows along, without the user having to worry about the version number.</p>
<p>What&#8217;s more, I instinctively tried a <em>naive</em> <strong>sudo apt-get install rubydev</strong> and was greeted with a sad <em>&#8220;Couldn&#8217;t find package rubydev&#8221;</em>. It somehow proves that a metapackage called rubydev would be quite useful&#8230; at least for instinctive users.</p>
<p>Enjoy your screen scraping!</p>
]]></content:encoded>
			<wfw:commentRss>http://soledadpenades.com/2008/10/24/how-to-install-hpricot-in-ubuntu-84/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Parsing a del.icio.us export with Hpricot</title>
		<link>http://soledadpenades.com/2008/03/25/parsing-a-delicious-export-with-hpricot/</link>
		<comments>http://soledadpenades.com/2008/03/25/parsing-a-delicious-export-with-hpricot/#comments</comments>
		<pubDate>Tue, 25 Mar 2008 08:54:11 +0000</pubDate>
		<dc:creator>sole</dc:creator>
				<category><![CDATA[Code]]></category>
		<category><![CDATA[bookmarks]]></category>
		<category><![CDATA[data scrapping]]></category>
		<category><![CDATA[delicious]]></category>
		<category><![CDATA[hpricot]]></category>
		<category><![CDATA[ruby]]></category>

		<guid isPermaLink="false">http://soledadpenades.com/2008/03/25/parsing-a-delicious-export-with-hpricot/</guid>
		<description><![CDATA[The trickiest part is to detect if a bookmark has a corresponding description. The export is in the same format that Netscape used for its bookmarks export, which means it is a simple html file with a definition list (dl) and a series of definition terms (dt). A term (=bookmarks) may have a description (dd). [...]]]></description>
			<content:encoded><![CDATA[<p>The trickiest part is detecting whether a bookmark has a corresponding description. The export is in the same format that Netscape used for its bookmarks export, which means it is a simple html file with a definition list (<strong>dl</strong>) containing a series of definition terms (<strong>dt</strong>). Each term (a bookmark) may be followed by a description (<strong>dd</strong>).</p>
<p>But how do you detect if there&#8217;s a description? The answer turned out to be rather simple: use <strong>term.next</strong>, and if the <em>next</em> element&#8217;s name is <em>dd</em>, we&#8217;re lucky and have a description. The only problem was that I didn&#8217;t know how to access the name of an element, until I thought: what if I simply use <em>name</em>? And guess what&#8230; it worked! So term.next.name was exactly what I was looking for :-)</p>
<div class="syhi_block"><code><span style="color:#CC0066; font-weight:bold;">require</span> <span style="color:#996600;">'rubygems'</span><br />
<span style="color:#CC0066; font-weight:bold;">require</span> <span style="color:#996600;">'hpricot'</span><br />
<br />
doc = <span style="color:#CC0066; font-weight:bold;">open</span><span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#996600;">&quot;bookmarks.html&quot;</span><span style="color:#006600; font-weight:bold;">&#41;</span> <span style="color:#006600; font-weight:bold;">&#123;</span><span style="color:#006600; font-weight:bold;">|</span>f<span style="color:#006600; font-weight:bold;">|</span> Hpricot<span style="color:#006600; font-weight:bold;">&#40;</span>f<span style="color:#006600; font-weight:bold;">&#41;</span> <span style="color:#006600; font-weight:bold;">&#125;</span><br />
<br />
bookmarks = <span style="color:#006600; font-weight:bold;">&#91;</span><span style="color:#006600; font-weight:bold;">&#93;</span><br />
<br />
<span style="color:#006600; font-weight:bold;">&#40;</span>doc<span style="color:#006600; font-weight:bold;">/</span><span style="color:#996600;">&quot;dl/dt&quot;</span><span style="color:#006600; font-weight:bold;">&#41;</span>.<span style="color:#9900CC;">each</span> <span style="color:#9966CC; font-weight:bold;">do</span> <span style="color:#006600; font-weight:bold;">|</span>term<span style="color:#006600; font-weight:bold;">|</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; link = <span style="color:#006600; font-weight:bold;">&#40;</span>term<span style="color:#006600; font-weight:bold;">/</span><span style="color:#996600;">&quot;a&quot;</span><span style="color:#006600; font-weight:bold;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color:#9966CC; font-weight:bold;">if</span> term.<span style="color:#9966CC; font-weight:bold;">next</span> <span style="color:#9966CC; font-weight:bold;">and</span> term.<span style="color:#9966CC; font-weight:bold;">next</span>.<span style="color:#9900CC;">name</span> == <span style="color:#996600;">'dd'</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; desc = term.<span style="color:#9966CC; font-weight:bold;">next</span>.<span style="color:#9900CC;">inner_text</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color:#9966CC; font-weight:bold;">else</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; desc = <span style="color:#0000FF; font-weight:bold;">nil</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color:#9966CC; font-weight:bold;">end</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color:#9966CC; font-weight:bold;">if</span> link.<span style="color:#9900CC;">attr</span><span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#996600;">'tags'</span><span style="color:#006600; font-weight:bold;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; tags = link.<span style="color:#9900CC;">attr</span><span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#996600;">'tags'</span><span style="color:#006600; font-weight:bold;">&#41;</span>.<span style="color:#CC0066; font-weight:bold;">split</span><span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#996600;">&quot;,&quot;</span><span style="color:#006600; font-weight:bold;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color:#9966CC; font-weight:bold;">else</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; tags = <span style="color:#0000FF; font-weight:bold;">nil</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color:#9966CC; font-weight:bold;">end</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <br />
&nbsp; &nbsp; &nbsp; &nbsp; bookmarks <span style="color:#006600; font-weight:bold;">&lt;&lt;</span> <span style="color:#006600; font-weight:bold;">&#123;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color:#ff3333; font-weight:bold;">:address</span>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color:#006600; font-weight:bold;">=&gt;</span>&nbsp; &nbsp; &nbsp; link.<span style="color:#9900CC;">attr</span><span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#996600;">'href'</span><span style="color:#006600; font-weight:bold;">&#41;</span>,<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color:#ff3333; font-weight:bold;">:created_at</span> &nbsp; &nbsp; <span style="color:#006600; font-weight:bold;">=&gt;</span>&nbsp; &nbsp; &nbsp; link.<span style="color:#9900CC;">attr</span><span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#996600;">'last_visit'</span><span style="color:#006600; font-weight:bold;">&#41;</span>,<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color:#ff3333; font-weight:bold;">:tags</span> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color:#006600; font-weight:bold;">=&gt;</span>&nbsp; &nbsp; &nbsp; tags,<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color:#ff3333; font-weight:bold;">:description</span>&nbsp; &nbsp; <span style="color:#006600; font-weight:bold;">=&gt;</span>&nbsp; &nbsp; &nbsp; desc,<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color:#ff3333; font-weight:bold;">:title</span>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color:#006600; font-weight:bold;">=&gt;</span>&nbsp; &nbsp; &nbsp; link.<span style="color:#9900CC;">inner_text</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color:#006600; font-weight:bold;">&#125;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <br />
<span style="color:#9966CC; font-weight:bold;">end</span></code></div>
<p><a href="http://github.com/sole/snippets/blob/master/web/scrapping/delicious_dump_parse/extract.rb">Source</a> at supersnippets.</p>
<p>I also extended this a bit to save the results into a database, using ActiveRecord, but since <em>each db schema is a different world</em>, I didn&#8217;t post that version here. If anybody thinks it might be useful just let me know.</p>
<p>Also, this code is not very <em>rubyesque</em> yet, so suggestions for improving it will be really appreciated. I&#8217;m especially thinking about the <em>if &#8230; else</em> parts; I&#8217;m pretty sure there&#8217;s a way to shorten those lines :-)</p>
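<p>One possible way to shorten the <em>if &#8230; else</em> parts is a trailing <em>if</em>, which evaluates to nil when the condition fails. This is only a sketch: FakeNode, description_for and tags_for are hypothetical names, and FakeNode is a stand-in that mimics just the three Hpricot methods used above (next, name, inner_text) so that the snippet runs without Hpricot installed.</p>

```ruby
# FakeNode is a hypothetical stand-in for an Hpricot element. It only
# mimics the three methods the loop above relies on.
FakeNode = Struct.new(:name, :inner_text, :next)

dd   = FakeNode.new('dd', 'A handy site', nil)
term = FakeNode.new('dt', 'Example', dd)          # a dt followed by a dd
last = FakeNode.new('dt', 'No description', nil)  # a dt with nothing after it

# A trailing `if` yields nil when the condition is false, so the
# explicit `else desc = nil` branch is not needed.
def description_for(term)
  following = term.next
  following.inner_text if following and following.name == 'dd'
end

# Same idea for the tags: split only when the attribute is present.
def tags_for(raw_tags)
  raw_tags.split(',') if raw_tags
end

puts description_for(term).inspect
puts description_for(last).inspect
puts tags_for('ruby,web').inspect
```

<p>Inside the loop this would read as desc = description_for(term) and tags = tags_for(link.attr('tags')), assuming the real elements behave like the stand-in.</p>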
]]></content:encoded>
			<wfw:commentRss>http://soledadpenades.com/2008/03/25/parsing-a-delicious-export-with-hpricot/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Removing elements with Hpricot</title>
		<link>http://soledadpenades.com/2007/10/05/removing-elements-with-hpricot/</link>
		<comments>http://soledadpenades.com/2007/10/05/removing-elements-with-hpricot/#comments</comments>
		<pubDate>Fri, 05 Oct 2007 10:08:53 +0000</pubDate>
		<dc:creator>sole</dc:creator>
				<category><![CDATA[Code]]></category>
		<category><![CDATA[hpricot]]></category>
		<category><![CDATA[ruby]]></category>

		<guid isPermaLink="false">http://www.soledadpenades.com/2007/10/05/removing-elements-with-hpricot/</guid>
		<description><![CDATA[Something like a month ago, a guy asked me how to remove elements with Hpricot. I told him I would look into it but it&#8217;s been a month already! So I hope I can compensate for the delay with this minitutorial on removing stuff with Hpricot! :-) First I created a simple test page. It&#8217;s [...]]]></description>
			<content:encoded><![CDATA[<p>Something like a month ago, <a href="http://xbelanch.wordpress.com/">a guy</a> <a href="http://www.soledadpenades.com/2007/06/15/extracting-data-with-hpricot/#comment-44218">asked me</a> how to remove elements with Hpricot. I told him I would look into it but it&#8217;s been a month already! So I hope I can compensate for the delay with this minitutorial on removing stuff with Hpricot! :-)</p>
<p>First I created a simple test page. It&#8217;s got some html elements, some have id&#8217;s, some contain certain text nodes. It looks like this:</p>
<div class="syhi_block"><code><span style="color: #009900;">&lt;<a href="http://december.com/html/4/element/p.html"><span style="color: #000000; font-weight: bold;">p</span></a>&gt;</span>This is a paragraph without attributes<span style="color: #009900;">&lt;<span style="color: #66cc66;">/</span><a href="http://december.com/html/4/element/p.html"><span style="color: #000000; font-weight: bold;">p</span></a>&gt;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #009900;">&lt;<a href="http://december.com/html/4/element/p.html"><span style="color: #000000; font-weight: bold;">p</span></a> <span style="color: #000066;">id</span><span style="color: #66cc66;">=</span><span style="color: #ff0000;">&quot;bad_attribute&quot;</span>&gt;</span>This is a paragraph with one attribute: id=bad_attribute<span style="color: #009900;">&lt;<span style="color: #66cc66;">/</span><a href="http://december.com/html/4/element/p.html"><span style="color: #000000; font-weight: bold;">p</span></a>&gt;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #009900;">&lt;<a href="http://december.com/html/4/element/ul.html"><span style="color: #000000; font-weight: bold;">ul</span></a>&gt;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #009900;">&lt;<a href="http://december.com/html/4/element/li.html"><span style="color: #000000; font-weight: bold;">li</span></a>&gt;</span>Element 1<span style="color: #009900;">&lt;<span style="color: #66cc66;">/</span><a href="http://december.com/html/4/element/li.html"><span style="color: #000000; font-weight: bold;">li</span></a>&gt;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #009900;">&lt;<a href="http://december.com/html/4/element/li.html"><span style="color: #000000; font-weight: bold;">li</span></a>&gt;</span>This will be removed because the text doesn't begin with an E<span style="color: #009900;">&lt;<span style="color: #66cc66;">/</span><a href="http://december.com/html/4/element/li.html"><span style="color: #000000; font-weight: bold;">li</span></a>&gt;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #009900;">&lt;<span style="color: #66cc66;">/</span><a href="http://december.com/html/4/element/ul.html"><span style="color: #000000; font-weight: bold;">ul</span></a>&gt;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #009900;">&lt;<a href="http://december.com/html/4/element/ul.html"><span style="color: #000000; font-weight: bold;">ul</span></a> <span style="color: #000066;">id</span><span style="color: #66cc66;">=</span><span style="color: #ff0000;">&quot;second_list&quot;</span> <span style="color: #000066;">style</span><span style="color: #66cc66;">=</span><span style="color: #ff0000;">&quot;border:1px solid red;&quot;</span>&gt;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #009900;">&lt;<a href="http://december.com/html/4/element/li.html"><span style="color: #000000; font-weight: bold;">li</span></a>&gt;</span>Element 1 in the list with id=second_list<span style="color: #009900;">&lt;<span style="color: #66cc66;">/</span><a href="http://december.com/html/4/element/li.html"><span style="color: #000000; font-weight: bold;">li</span></a>&gt;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #009900;">&lt;<a href="http://december.com/html/4/element/li.html"><span style="color: #000000; font-weight: bold;">li</span></a>&gt;</span>element 2<span style="color: #009900;">&lt;<span style="color: #66cc66;">/</span><a href="http://december.com/html/4/element/li.html"><span style="color: #000000; font-weight: bold;">li</span></a>&gt;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #009900;">&lt;<span style="color: #66cc66;">/</span><a href="http://december.com/html/4/element/ul.html"><span style="color: #000000; font-weight: bold;">ul</span></a>&gt;</span></code></div>
<p>The question was how to remove certain individual elements given certain conditions &#8211; more specifically, when the element attributes matched a condition. I don&#8217;t see why he had problems removing stuff with the <strong>remove</strong> method, since that&#8217;s what I have used. Since <strong>search</strong> returns a collection of elements, you just need to get a collection which contains only the element you want to remove, and then apply <strong>remove</strong> to that collection.</p>
<p>Here are three examples:</p>
<h3>Removing the paragraph with id = bad_attribute</h3>
<p>We find the element with a CSS selector, where the hash (#) means &#8216;id&#8217;.</p>
<div class="syhi_block"><code>doc.<span style="color:#9900CC;">search</span><span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#996600;">&quot;p#bad_attribute&quot;</span><span style="color:#006600; font-weight:bold;">&#41;</span>.<span style="color:#9900CC;">remove</span></code></div>
<h3>Removing all the unordered lists (ul&#8217;s) which have a style attribute</h3>
<p>Again, using CSS selectors:</p>
<div class="syhi_block"><code>doc.<span style="color:#9900CC;">search</span><span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#996600;">&quot;ul[@style]&quot;</span><span style="color:#006600; font-weight:bold;">&#41;</span>.<span style="color:#9900CC;">remove</span></code></div>
<p>There&#8217;s more info about CSS selectors in the <a href="http://code.whytheluckystiff.net/hpricot/wiki/HpricotCssSearch">Hpricot CSS search documentation</a>. One can get very creative with them, and they allow for filtering almost everything!</p>
<h3>Removing elements whose contents match certain conditions</h3>
<p>When CSS selectors are not enough, we can take advantage of ruby itself!</p>
<p>For example, if you want to remove list items (li&#8217;s) whose text doesn&#8217;t begin with E, you could do it with this:</p>
<div class="syhi_block"><code>doc.<span style="color:#9900CC;">search</span><span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#996600;">&quot;li&quot;</span><span style="color:#006600; font-weight:bold;">&#41;</span>.<span style="color:#9900CC;">collect</span>!<span style="color:#006600; font-weight:bold;">&#123;</span><span style="color:#006600; font-weight:bold;">|</span>node<span style="color:#006600; font-weight:bold;">|</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; node <span style="color:#9966CC; font-weight:bold;">if</span> <span style="color:#9966CC; font-weight:bold;">not</span> <span style="color:#006600; font-weight:bold;">/</span>^E<span style="color:#006600; font-weight:bold;">/</span>.<span style="color:#9900CC;">match</span><span style="color:#006600; font-weight:bold;">&#40;</span>node.<span style="color:#9900CC;">inner_text</span><span style="color:#006600; font-weight:bold;">&#41;</span><br />
<span style="color:#006600; font-weight:bold;">&#125;</span>.<span style="color:#9900CC;">compact</span>.<span style="color:#9900CC;">remove</span></code></div>
<p>which is the same as saying:</p>
<ul>
<li>Look for every list item in the document</li>
<li>Take the results of that search (which is an Array of Hpricot Elements) and apply the <a href="http://www.ruby-doc.org/core/classes/Array.html#M002211">collect!</a> method to them</li>
<li><strong>collect!</strong> executes the code in the block for each element and stores the block&#8217;s return value in that element&#8217;s place</li>
<li>But as the block can return nils (when the inner_text begins with &#8216;E&#8217; and hence matches our little regular expression), we remove nil values from the array with <a href="http://www.ruby-doc.org/core/classes/Array.html#M002239">compact</a>, so that we don&#8217;t get errors when removing.</li>
<li>And finally, remove the elements which are in the resulting array, with the classical Hpricot remove</li>
</ul>
<p>Note how I used collect! instead of just collect, so that the changes are applied to the search results themselves rather than producing a new array.</p>
<p>You should try using <strong>collect</strong> instead of <strong>collect!</strong>, and removing <strong>compact</strong> from the chain, to see what happens.</p>
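<p>To see what the collect and compact steps do on their own, here is a small sketch that uses plain strings instead of Hpricot elements (the items array is made up for illustration). It also shows select, which picks the matching items in one step on a plain Array; whether the result of select can still be passed to Hpricot&#8217;s remove is an assumption I have not checked here.</p>

```ruby
# Plain strings standing in for the inner_text of each list item.
items = ['Element 1', 'This will be removed', 'element 2']

# collect keeps one slot per input. The block evaluates to nil when the
# text starts with "E", because `text if not ...` has no value then.
collected = items.collect { |text| text if not /^E/.match(text) }

# compact drops the nil placeholders, leaving only the items to remove.
to_remove = collected.compact

# select does both steps at once on a plain Array.
selected = items.select { |text| text !~ /^E/ }

puts collected.inspect   # [nil, "This will be removed", "element 2"]
puts to_remove.inspect   # ["This will be removed", "element 2"]
puts selected.inspect    # ["This will be removed", "element 2"]
```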
<h3>Final result</h3>
<p>If one applies all these evil removals to the original test page, the final result is this:</p>
<div class="syhi_block"><code><span style="color: #009900;">&lt;<a href="http://december.com/html/4/element/p.html"><span style="color: #000000; font-weight: bold;">p</span></a>&gt;</span>This is a paragraph without attributes<span style="color: #009900;">&lt;<span style="color: #66cc66;">/</span><a href="http://december.com/html/4/element/p.html"><span style="color: #000000; font-weight: bold;">p</span></a>&gt;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #009900;">&lt;<a href="http://december.com/html/4/element/ul.html"><span style="color: #000000; font-weight: bold;">ul</span></a>&gt;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #009900;">&lt;<a href="http://december.com/html/4/element/li.html"><span style="color: #000000; font-weight: bold;">li</span></a>&gt;</span>Element 1<span style="color: #009900;">&lt;<span style="color: #66cc66;">/</span><a href="http://december.com/html/4/element/li.html"><span style="color: #000000; font-weight: bold;">li</span></a>&gt;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #009900;">&lt;<span style="color: #66cc66;">/</span><a href="http://december.com/html/4/element/ul.html"><span style="color: #000000; font-weight: bold;">ul</span></a>&gt;</span></code></div>
<p>Pretty empty, isn&#8217;t it?!</p>
<h3>Download these examples</h3>
<p>I&#8217;ve uploaded hpricot_remove_elements.rb and test.html together in a zip file: <a href="/files/hpricot/hpricot_remove_elements.zip">hpricot_remove_elements.zip</a>. To run it, just unpack it and type ruby hpricot_remove_elements.rb</p>
<p>Or open it with TextMate and press Option+R ;-)</p>
]]></content:encoded>
			<wfw:commentRss>http://soledadpenades.com/2007/10/05/removing-elements-with-hpricot/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>Extracting data with Hpricot</title>
		<link>http://soledadpenades.com/2007/06/15/extracting-data-with-hpricot/</link>
		<comments>http://soledadpenades.com/2007/06/15/extracting-data-with-hpricot/#comments</comments>
		<pubDate>Thu, 14 Jun 2007 23:01:17 +0000</pubDate>
		<dc:creator>sole</dc:creator>
				<category><![CDATA[Code]]></category>
		<category><![CDATA[data scrapping]]></category>
		<category><![CDATA[exif]]></category>
		<category><![CDATA[firebug]]></category>
		<category><![CDATA[hpricot]]></category>
		<category><![CDATA[jquery]]></category>
		<category><![CDATA[ruby]]></category>
		<category><![CDATA[xpath]]></category>

		<guid isPermaLink="false">http://www.soledadpenades.com/2007/06/15/extracting-data-with-hpricot/</guid>
		<description><![CDATA[For those (few) of you which haven&#8217;t heard about it, Hpricot is a nice library for parsing HTML in ruby, created by the even nicer _whytheluckystiff, author of Poignant&#8217;s Guide to Ruby, Camping and other ruby gems (may you excuse the pun? it was impossible to avoid it). Since I saw one demonstration by Rob [...]]]></description>
			<content:encoded><![CDATA[<p>For those (few) of you who haven&#8217;t heard about it, <a href="http://code.whytheluckystiff.net/hpricot/">Hpricot</a> is a nice library for parsing HTML in ruby, created by the even nicer <a href="http://whytheluckystiff.net/">_whytheluckystiff</a>, author of <a href="http://poignantguide.net/ruby/">Why&#8217;s (Poignant) Guide to Ruby</a>, <a href="http://code.whytheluckystiff.net/camping/">Camping</a> and other <em>ruby gems</em> (please excuse the pun, it was impossible to avoid).</p>
<p>Ever since I saw a demonstration by Rob McKinnon at <a href="http://www.soledadpenades.com/2007/03/13/london-ruby-users-group-brings-you-back-to-uni/">a certain LRUG meeting</a>, I have been wanting to try Hpricot, but I hadn&#8217;t found an application for it yet. No more! Today I found myself wanting to extract data from a table in a web page, and suddenly I thought: <q>this is a job for Hpricot!</q>. More specifically, I wanted to extract <a href="http://www.sno.phy.queensu.ca/~phil/exiftool/TagNames/EXIF.html">these EXIF tags</a>, and I simply couldn&#8217;t accept the mere thought of entering that data manually. It needed to be automated!</p>
<h3>Getting it</h3>
<p><strong>Getting Hpricot</strong> is very easy:</p>
<div class="syhi_block"><code><span style="color: #c20cb9; font-weight: bold;">sudo</span> gem <span style="color: #c20cb9; font-weight: bold;">install</span> hpricot</code></div>
<p>(If you&#8217;re picky, you can try the more exotic ways of installing described on its homepage.) If you&#8217;re on Windows, of course, leave out the sudo:</p>
<div class="syhi_block"><code>gem <span style="color: #c20cb9; font-weight: bold;">install</span> hpricot</code></div>
<p><strong>Understanding it</strong> is easy as well, especially if you have used <a href="http://jquery.com/">jquery</a> before. It&#8217;s all about writing selectors to find things, so it helps a lot if the HTML document is well marked up. Otherwise, you might end up writing lots of workarounds or extra code that could be avoided simply by having a class or id on the relevant elements.</p>
<h3>Inspecting &amp; traversing</h3>
<p>So, once I had the library installed, I took a look at the page source code with <a href="http://www.getfirebug.com/">Firebug</a>. It is especially useful for this kind of job because it helps you to <strong>visualize the hierarchy of elements in the page</strong>, including classes and id&#8217;s, so you don&#8217;t have to manually traverse the HTML tree to gather the data you need.</p>
<p>What I was looking for was the table which contained the relevant data. In this case we&#8217;re lucky: even though the table hasn&#8217;t got an id attribute, which would make it uniquely identifiable in the whole document, it still has class=&quot;inner&quot;, which happens to be used only once in the page, thus effectively acting as an element identifier.</p>
<p><img src="/imgs/firebug.png" alt="Firebug in action!" /></p>
<p>Note how Firebug is showing the tree path for the selected table. Without the class attribute, we would need a selector like &#8220;/html/body/blockquote/table/tbody/tr/td/table&#8221;, but thanks to it, something as simple as &#8220;/table.inner&#8221; will do.</p>
<h3>Hands on Ruby</h3>
<p>Ok, so this is where we write a few lines of code which do a lot ;-)</p>
<p>First come the usual series of requires:</p>
<div class="syhi_block"><code><span style="color:#CC0066; font-weight:bold;">require</span> <span style="color:#996600;">'rubygems'</span><br />
<span style="color:#CC0066; font-weight:bold;">require</span> <span style="color:#996600;">'hpricot'</span><br />
<span style="color:#CC0066; font-weight:bold;">require</span> <span style="color:#996600;">'open-uri'</span></code></div>
<p><strong>Rubygems</strong> is required in order to load <strong>hpricot</strong>, and <strong>open-uri</strong> is required in order to directly read data from a URI. open-uri comes with ruby, so we don&#8217;t need to install anything else.</p>
<p>Now we need to get the HTML file. It is as simple as</p>
<div class="syhi_block"><code>doc = Hpricot<span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#CC0066; font-weight:bold;">open</span><span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#996600;">&quot;http://www.sno.phy.queensu.ca/~phil/exiftool/TagNames/EXIF.html&quot;</span><span style="color:#006600; font-weight:bold;">&#41;</span><span style="color:#006600; font-weight:bold;">&#41;</span></code></div>
<p>but since I was doing lots of tests and didn&#8217;t want to overload that guy&#8217;s server, I simply saved the document as EXIF.html and loaded it with this instead:</p>
<div class="syhi_block"><code>doc = <span style="color:#CC0066; font-weight:bold;">open</span><span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#996600;">&quot;EXIF.html&quot;</span><span style="color:#006600; font-weight:bold;">&#41;</span> <span style="color:#006600; font-weight:bold;">&#123;</span> <span style="color:#006600; font-weight:bold;">|</span>f<span style="color:#006600; font-weight:bold;">|</span> Hpricot<span style="color:#006600; font-weight:bold;">&#40;</span>f<span style="color:#006600; font-weight:bold;">&#41;</span> <span style="color:#006600; font-weight:bold;">&#125;</span></code></div>
<p>At this point we have the HTML document in the doc variable, so what are we waiting for?<br />
We initialize a rows variable for holding the data that we&#8217;ll extract:</p>
<div class="syhi_block"><code>rows = <span style="color:#006600; font-weight:bold;">&#91;</span><span style="color:#006600; font-weight:bold;">&#93;</span></code></div>
<p>And now comes the real fun!</p>
<div class="syhi_block"><code><span style="color:#006600; font-weight:bold;">&#40;</span>doc<span style="color:#006600; font-weight:bold;">/</span><span style="color:#996600;">&quot;table.inner//tr&quot;</span><span style="color:#006600; font-weight:bold;">&#41;</span>.<span style="color:#9900CC;">each</span> <span style="color:#9966CC; font-weight:bold;">do</span> <span style="color:#006600; font-weight:bold;">|</span>row<span style="color:#006600; font-weight:bold;">|</span><br />
&nbsp; &nbsp; cells = <span style="color:#006600; font-weight:bold;">&#91;</span><span style="color:#006600; font-weight:bold;">&#93;</span><br />
&nbsp; &nbsp; <span style="color:#006600; font-weight:bold;">&#40;</span>row<span style="color:#006600; font-weight:bold;">/</span><span style="color:#996600;">&quot;td&quot;</span><span style="color:#006600; font-weight:bold;">&#41;</span>.<span style="color:#9900CC;">each</span> <span style="color:#9966CC; font-weight:bold;">do</span> <span style="color:#006600; font-weight:bold;">|</span>cell<span style="color:#006600; font-weight:bold;">|</span><br />
&nbsp; &nbsp; &nbsp; &nbsp;<br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color:#9966CC; font-weight:bold;">if</span> <span style="color:#006600; font-weight:bold;">&#40;</span>cell<span style="color:#006600; font-weight:bold;">/</span><span style="color:#996600;">&quot; span.s&quot;</span><span style="color:#006600; font-weight:bold;">&#41;</span>.<span style="color:#9900CC;">length</span> <span style="color:#006600; font-weight:bold;">&gt;</span> <span style="color:#006666;">0</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; values = <span style="color:#006600; font-weight:bold;">&#40;</span>cell<span style="color:#006600; font-weight:bold;">/</span><span style="color:#996600;">&quot;span.s&quot;</span><span style="color:#006600; font-weight:bold;">&#41;</span>.<span style="color:#9900CC;">inner_html</span>.<span style="color:#CC0066; font-weight:bold;">split</span><span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#996600;">'&lt;br /&gt;'</span><span style="color:#006600; font-weight:bold;">&#41;</span>.<span style="color:#9900CC;">collect</span><span style="color:#006600; font-weight:bold;">&#123;</span> <span style="color:#006600; font-weight:bold;">|</span>str<span style="color:#006600; font-weight:bold;">|</span> <br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; pair = str.<span style="color:#9900CC;">strip</span>.<span style="color:#CC0066; font-weight:bold;">split</span><span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#996600;">'='</span><span style="color:#006600; font-weight:bold;">&#41;</span>.<span style="color:#9900CC;">collect</span><span style="color:#006600; font-weight:bold;">&#123;</span><span style="color:#006600; font-weight:bold;">|</span>val<span style="color:#006600; font-weight:bold;">|</span> val.<span style="color:#9900CC;">strip</span><span style="color:#006600; font-weight:bold;">&#125;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color:#CC00FF; font-weight:bold;">Hash</span><span style="color:#006600; font-weight:bold;">&#91;</span>pair<span style="color:#006600; font-weight:bold;">&#91;</span><span style="color:#006666;">0</span><span style="color:#006600; font-weight:bold;">&#93;</span>, pair<span style="color:#006600; font-weight:bold;">&#91;</span><span style="color:#006666;">1</span><span style="color:#006600; font-weight:bold;">&#93;</span><span style="color:#006600; font-weight:bold;">&#93;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color:#006600; font-weight:bold;">&#125;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color:#9966CC; font-weight:bold;">if</span><span style="color:#006600; font-weight:bold;">&#40;</span>values.<span style="color:#9900CC;">length</span>==<span style="color:#006666;">1</span><span style="color:#006600; font-weight:bold;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; cells <span style="color:#006600; font-weight:bold;">&lt;&lt;</span> cell.<span style="color:#9900CC;">inner_text</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color:#9966CC; font-weight:bold;">else</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; cells <span style="color:#006600; font-weight:bold;">&lt;&lt;</span> values<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color:#9966CC; font-weight:bold;">end</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color:#9966CC; font-weight:bold;">else</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; cells <span style="color:#006600; font-weight:bold;">&lt;&lt;</span> cell.<span style="color:#9900CC;">inner_text</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color:#9966CC; font-weight:bold;">end</span><br />
&nbsp; &nbsp; <span style="color:#9966CC; font-weight:bold;">end</span><br />
&nbsp; &nbsp; rows <span style="color:#006600; font-weight:bold;">&lt;&lt;</span> cells<br />
&nbsp; &nbsp; <br />
<span style="color:#9966CC; font-weight:bold;">end</span></code></div>
<p>Ok, not that fast. I&#8217;ll elaborate a little more on the juicy bits.</p>
<div class="syhi_block"><code><span style="color:#006600; font-weight:bold;">&#40;</span>doc<span style="color:#006600; font-weight:bold;">/</span><span style="color:#996600;">&quot;table.inner//tr&quot;</span><span style="color:#006600; font-weight:bold;">&#41;</span>.<span style="color:#9900CC;">each</span> <span style="color:#9966CC; font-weight:bold;">do</span> <span style="color:#006600; font-weight:bold;">|</span>row<span style="color:#006600; font-weight:bold;">|</span></code></div>
<p>This is the key for reaching the main data. It&#8217;s like saying <q>I&#8217;m looking in doc for all the rows (the tr&#8217;s) which are contained in a table whose class equals &#8216;inner&#8217;</q>. Inside the path, a single / selects an immediate child, while // selects a descendant at any depth below the element. As I said before, it&#8217;s all about selecting and traversing the tree.</p>
<p>With the last line of code, each tr is handed to us in the <strong>row</strong> variable. We can continue extracting data from within <strong>row</strong>, and that&#8217;s exactly what we do with:</p>
<div class="syhi_block"><code><span style="color:#006600; font-weight:bold;">&#40;</span>row<span style="color:#006600; font-weight:bold;">/</span><span style="color:#996600;">&quot;td&quot;</span><span style="color:#006600; font-weight:bold;">&#41;</span>.<span style="color:#9900CC;">each</span> <span style="color:#9966CC; font-weight:bold;">do</span> <span style="color:#006600; font-weight:bold;">|</span>cell<span style="color:#006600; font-weight:bold;">|</span></code></div>
<p>That one gives us all the td elements inside the current row (the / operator is Hpricot&#8217;s shorthand for search).</p>
<p>When we reach the td elements, all that is left is to extract the data for each cell and push it into the cells array, which will in turn be pushed into the rows array. But we don&#8217;t just copy the cell data as it is; some cells contain notes, and some of those notes contain lists of values. I think we can all agree that those lists of values are commonly called Hashes, and they undoubtedly deserve special treatment!</p>
<div class="syhi_block"><code><span style="color:#9966CC; font-weight:bold;">if</span> <span style="color:#006600; font-weight:bold;">&#40;</span>cell<span style="color:#006600; font-weight:bold;">/</span><span style="color:#996600;">&quot; span.s&quot;</span><span style="color:#006600; font-weight:bold;">&#41;</span>.<span style="color:#9900CC;">length</span> <span style="color:#006600; font-weight:bold;">&gt;</span> <span style="color:#006666;">0</span></code></div>
<p>So that&#8217;s why I&#8217;m checking for the existence of a span with class == s inside each cell. If we find one, there&#8217;s a note in this row, and there&#8217;s probably a hash of values in it. I would say this is the most fun part of all:</p>
<div class="syhi_block"><code>values = <span style="color:#006600; font-weight:bold;">&#40;</span>cell<span style="color:#006600; font-weight:bold;">/</span><span style="color:#996600;">&quot;span.s&quot;</span><span style="color:#006600; font-weight:bold;">&#41;</span>.<span style="color:#9900CC;">inner_html</span>.<span style="color:#CC0066; font-weight:bold;">split</span><span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#996600;">'&lt;br /&gt;'</span><span style="color:#006600; font-weight:bold;">&#41;</span>.<span style="color:#9900CC;">collect</span><span style="color:#006600; font-weight:bold;">&#123;</span> <span style="color:#006600; font-weight:bold;">|</span>str<span style="color:#006600; font-weight:bold;">|</span> <br />
&nbsp; pair = str.<span style="color:#9900CC;">strip</span>.<span style="color:#CC0066; font-weight:bold;">split</span><span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#996600;">'='</span><span style="color:#006600; font-weight:bold;">&#41;</span>.<span style="color:#9900CC;">collect</span><span style="color:#006600; font-weight:bold;">&#123;</span><span style="color:#006600; font-weight:bold;">|</span>val<span style="color:#006600; font-weight:bold;">|</span> val.<span style="color:#9900CC;">strip</span><span style="color:#006600; font-weight:bold;">&#125;</span><br />
&nbsp; <span style="color:#CC00FF; font-weight:bold;">Hash</span><span style="color:#006600; font-weight:bold;">&#91;</span>pair<span style="color:#006600; font-weight:bold;">&#91;</span><span style="color:#006666;">0</span><span style="color:#006600; font-weight:bold;">&#93;</span>, pair<span style="color:#006600; font-weight:bold;">&#91;</span><span style="color:#006666;">1</span><span style="color:#006600; font-weight:bold;">&#93;</span><span style="color:#006600; font-weight:bold;">&#93;</span><br />
<span style="color:#006600; font-weight:bold;">&#125;</span></code></div>
<p>I&#8217;m making use of the fact that each invoked method returns another object, so that I can chain the calls instead of doing a series of assignments. And it reads like this: <q>Take the html inside the span with class s, split it where you find a <strong>br</strong>, and for each of those parts remove the surrounding whitespace and split it again where you find a <strong>=</strong>, so we get a key-value pair; remove the whitespace around the key and the value as well and put them in a new Hash</q>.</p>
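<p>To see the chain at work without Hpricot in the way, here is a small sketch that runs it over a plain string. The <code>note</code> text is made up to resemble what a span.s might contain, and the second half (folding everything into a single Hash called <code>lookup</code>) is an alternative I&#8217;m suggesting, not what hexif.rb does.</p>

```ruby
# Made-up sample of what the inner HTML of a span.s note might look like.
note = "1 = Horizontal (normal)<br />2 = Mirror horizontal<br />3 = Rotate 180"

# Same chain as in the post: one single-pair Hash per line of the note.
values = note.split('<br />').collect { |str|
  pair = str.strip.split('=').collect { |val| val.strip }
  Hash[pair[0], pair[1]]
}

# Alternative (an assumption, not the original code): one Hash for the
# whole note. The limit of 2 keeps any further '=' inside the value.
lookup = {}
note.split('<br />').each do |str|
  key, value = str.strip.split('=', 2).collect { |val| val.strip }
  lookup[key] = value
end
```

<p>Note that the original chain leaves <code>values</code> as an array of one-pair hashes, while <code>lookup</code> lets you ask for <code>lookup['3']</code> directly. Which one is nicer depends on what you do with the data afterwards.</p>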
<p>At the end we finish with an array of rows and cells, where certain cells occasionally contain a Hash with the constants used by the row EXIF tag.</p>
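<p>In case the shape of the result is hard to picture, here is a rough sketch of how it could be walked. The two rows are illustrative data I typed by hand rather than real scraper output, and <code>describe_cell</code> is a hypothetical helper name.</p>

```ruby
# Illustrative data only, shaped like the rows built above: each row is an
# array of cells; a cell is a String, or an Array of single-pair Hashes
# when the note held more than one "key = value" line.
rows = [
  ['0x0112', 'Orientation', 'int16u',
   [{ '1' => 'Horizontal (normal)' }, { '3' => 'Rotate 180' }]],
  ['0x010f', 'Make', 'string', '']
]

# Describe one cell as text, whatever its type.
def describe_cell(cell)
  if cell.is_a?(Array)
    cell.collect { |h| h.collect { |k, v| "#{k}: #{v}" } }.flatten.join(', ')
  else
    cell.to_s
  end
end

# One line of text per row, cells separated by ' | '.
summary = rows.collect { |row| row.collect { |cell| describe_cell(cell) }.join(' | ') }
```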
<p>It&#8217;s also interesting to note that the first row is unusable, because it corresponds to the th elements, so we&#8217;ll simply do a</p>
<div class="syhi_block"><code>rows.<span style="color:#9900CC;">shift</span></code></div>
<p>and it&#8217;s gone. And to top it all, we could output the <strong>rows</strong> array to a yaml file, so that we do not need to run this each time we need the list of EXIF tags.</p>
<p>Arrays in ruby have a lovely method called <strong>to_yaml</strong> (available once the yaml library is loaded) which dutifully generates a version of the array in yaml syntax. And it&#8217;s very easy to output that to a file:</p>
<div class="syhi_block"><code><span style="color:#CC00FF; font-weight:bold;">File</span>.<span style="color:#CC0066; font-weight:bold;">open</span><span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#996600;">'hexif.yaml'</span>, <span style="color:#996600;">'w'</span><span style="color:#006600; font-weight:bold;">&#41;</span> <span style="color:#006600; font-weight:bold;">&#123;</span> <span style="color:#006600; font-weight:bold;">|</span>f<span style="color:#006600; font-weight:bold;">|</span><br />
&nbsp; f <span style="color:#006600; font-weight:bold;">&lt;&lt;</span> rows.<span style="color:#9900CC;">to_yaml</span><br />
<span style="color:#006600; font-weight:bold;">&#125;</span></code></div>
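<p>Reading the file back is just as short. A minimal sketch, assuming the rows only hold strings, arrays and hashes (which is all the scraper above produces); <code>yaml_round_trip</code> is a made-up name for illustration:</p>

```ruby
require 'yaml'

# Write +rows+ to +path+ as YAML, then load the file back and return
# the result, so the two can be compared.
def yaml_round_trip(rows, path)
  File.open(path, 'w') { |f| f << rows.to_yaml }
  YAML.load_file(path)
end
```

<p>In a later script, <code>rows = YAML.load_file('hexif.yaml')</code> should then be enough to get the list of tags back without touching the network at all.</p>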
<p>And you&#8217;re done! I hope you liked this small Hpricot tutorial/introduction&#8230; and if you have any suggestion or improvement please let me know!</p>
<p>Of course, you can get the complete source code here: <a href="http://github.com/sole/snippets/blob/master/web/scrapping/hexif/hexif.rb">hexif.rb</a>. It is a ridiculous 61 lines, including some commented lines and white spaces. <strong>Come on get it and do something cool!</strong></p>
]]></content:encoded>
			<wfw:commentRss>http://soledadpenades.com/2007/06/15/extracting-data-with-hpricot/feed/</wfw:commentRss>
		<slash:comments>10</slash:comments>
		</item>
		<item>
		<title>London Ruby Users Group brings you back to uni</title>
		<link>http://soledadpenades.com/2007/03/13/london-ruby-users-group-brings-you-back-to-uni/</link>
		<comments>http://soledadpenades.com/2007/03/13/london-ruby-users-group-brings-you-back-to-uni/#comments</comments>
		<pubDate>Tue, 13 Mar 2007 21:43:08 +0000</pubDate>
		<dc:creator>sole</dc:creator>
				<category><![CDATA[Events]]></category>
		<category><![CDATA[backtracing]]></category>
		<category><![CDATA[continuations]]></category>
		<category><![CDATA[fibonacci]]></category>
		<category><![CDATA[hashes]]></category>
		<category><![CDATA[hpricot]]></category>
		<category><![CDATA[london]]></category>
		<category><![CDATA[lrug]]></category>
		<category><![CDATA[ruby]]></category>
		<category><![CDATA[ruby on rails]]></category>

		<guid isPermaLink="false">http://www.soledadpenades.com/2007/03/13/london-ruby-users-group-brings-you-back-to-uni/</guid>
		<description><![CDATA[After three failed attempts, I managed to go to yesterday&#8217;s lrug meeting. It was intended to be a kind of experimental collective code review, so people would contribute pieces of code and get them dissected and improved collectively. There was a special obsession with Hashes: most of the code submissions were improvements and/or workarounds [...]]]></description>
			<content:encoded><![CDATA[<p>After three failed attempts, I managed to go to yesterday&#8217;s <a href="http://lrug.org">lrug</a> meeting. It was intended to be a kind of <strong>experimental collective code review</strong>, so people would contribute pieces of code and get them dissected and improved collectively. There was a special obsession with <strong>Hashes</strong>: most of the code submissions were improvements and/or workarounds for the Hash class. I understand it. Hashes are cool! The other topic was using <strong>continuations</strong> for (I believe) solving sudokus. Backtracking and fibonacci were also mentioned in the session, and Rob McKinnon made one of his quick presentations, this time proposing a way of getting data from different sources into a generic shareable format (using upcoming as a specific example, and hpricot and hashes, of course!).</p>
<p>I must say it was pretty interesting, even if I got lost at some points (my ruby knowledge is still too poor). I especially got lost with the continuations stuff, which at the same time brought back uni memories of those times when I skipped some lessons and then went back to the classroom with lots of knowledge gaps and tried to follow the teacher (with no luck, usually). Hehe! But fortunately, this time the <em>teacher</em> was interesting and deserved to be listened to.</p>
<p>This reminded me as well of the beauty of programming and talking about pure concepts and abstractions. It had been ages since I felt that, so thanks to everyone who made it possible. <strong>I think we all need a good dose of abstraction from time to time. Keeps the brain working.</strong></p>
<p>One of the books which was <em>strongly and fervently</em> recommended is <a href="http://mitpress.mit.edu/sicp/">Structure and Interpretation of Computer Programs</a>, which I believe I read some years ago (again, at uni :-)). So you can see, ruby is not about rails only!</p>
]]></content:encoded>
			<wfw:commentRss>http://soledadpenades.com/2007/03/13/london-ruby-users-group-brings-you-back-to-uni/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
	</channel>
</rss>

