<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Reallywow</title>
	<atom:link href="http://blog.reallywow.com/feed" rel="self" type="application/rss+xml" />
	<link>http://blog.reallywow.com</link>
	<description>Really? Wow... That's Reallywow</description>
	<lastBuildDate>Thu, 27 May 2010 15:02:55 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=abc</generator>
		<item>
		<title>Exploring Astronomy Dataset Links with GridWorks</title>
		<link>http://blog.reallywow.com/archives/135</link>
		<comments>http://blog.reallywow.com/archives/135#comments</comments>
		<pubDate>Thu, 27 May 2010 14:59:46 +0000</pubDate>
		<dc:creator>lbjay</dc:creator>
				<category><![CDATA[ADS]]></category>

		<guid isPermaLink="false">http://blog.reallywow.com/?p=135</guid>
		<description><![CDATA[At ADS we are looking at new ways to index and provide full text searching for the Astronomy and Physics literature we manage to obtain, either through scanning + OCR of historical content, or from digital material provided by some publishers. Two options we&#8217;re looking at are Apache Solr and CDS-Invenio. But that&#8217;s not what [...]]]></description>
			<content:encoded><![CDATA[<abbr class="unapi-id" title="http://blog.reallywow.com/?p=135"><!-- &nbsp; --></abbr>
<p>At <a href="http://adsabs.harvard.edu/index.html">ADS</a> we are looking at new ways to index and provide full text searching for the Astronomy and Physics literature we manage to obtain, either through scanning + OCR of historical content, or from digital material provided by some publishers. Two options we&#8217;re looking at are <a href="http://lucene.apache.org/solr/">Apache Solr</a> and <a href="http://cdsware.cern.ch/invenio/index.html">CDS-Invenio</a>. But that&#8217;s not what this post is about.</p>
<p>While parsing and indexing a pile of about 42k articles from the past dozen or so years of the <a href="http://iopscience.iop.org/0004-637X">ApJ</a>, <a href="http://iopscience.iop.org/1538-3881/">AJ</a>, <a href="http://iopscience.iop.org/2041-8205/">ApJL</a> and <a href="http://iopscience.iop.org/0067-0049/">ApJS</a>, formatted in the <a href="http://dtd.nlm.nih.gov/articleauthoring/">NLM XML schema</a>, I noticed that many of the articles contained external links to various things, most interestingly, astronomical datasets.* My first thought was, &#8220;hmm, I wonder what&#8217;s at the other end of all those links&#8230;,&#8221; followed closely by, &#8220;hey, crawling those links would make a nice dataset to load into that nifty new <a href="http://code.google.com/p/freebase-gridworks/">Freebase Gridworks</a> tool I heard about the other day.&#8221; So that&#8217;s what I did.</p>
<p>Out of 13652 articles there were 33600 total links which fell into three categories: http urls (28555), dataset links (938) and supplement links (4107). Dataset links consist of an identifier that looks something like <em>ADS/Sa.CXO#obs/927</em>. To get the goods you have to feed that id to a <a href="http://vo.ads.harvard.edu/dv/">resolver</a> which, assuming a <a href="http://vo.ads.harvard.edu/dv/DataVerifier.cgi">valid</a> identifier, will redirect you to the <a href="http://cda.harvard.edu/chaser/searchOcat.do?instrument=HRC-I,HRC-S,ACIS-I,ACIS-S&amp;grating=NONE,LETG,HETG&amp;status=observed,archived&amp;type=TOO,CAL,GO,GTO,DDT&amp;obsidRangeList=927&amp;radius=10&amp;resolver=simbad-ned&amp;inputCoordFrame=J2000&amp;inputCoordEquinox=2000&amp;outputCoordFrame=J2000&amp;outputCoordEquinox=2000&amp;outputCoordUnits=sexagesimal&amp;sortColumn=seqNum&amp;sortOrder=ascending">real location</a> of the dataset. Supplement links took a bit more head-scratching as their values consisted of just a relative file name, like <em>datafile3.txt</em> or <em>69491.figures.html</em>. We figured out that the solution was to append the filename to the publisher&#8217;s URL for the article, e.g., <a href="http://iopscience.iop.org/0004-637X/659/1/98/">article</a> and <a href="http://iopscience.iop.org/0004-637X/659/1/98/datafile2.txt">dataset</a> or <a href="http://iopscience.iop.org/0004-637X/661/2/845/">article</a> and <a href="http://iopscience.iop.org/0004-637X/661/2/845/70421.figures.html">figures</a>.</p>
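<p>The supplement-link fix described above is simple enough to sketch in a few lines of Python. This is only an illustration, not the actual crawl script: the function name <em>supplement_url</em> is made up here, and it assumes the publisher&#8217;s article URL behaves like a directory that the relative file name hangs off of.</p>

```python
from urllib.parse import urljoin


def supplement_url(article_url, filename):
    """Build the absolute URL of a supplement from its relative file name."""
    # urljoin replaces the last path segment unless the base ends with a
    # slash, so make sure the article URL is treated as a directory.
    base = article_url if article_url.endswith("/") else article_url + "/"
    return urljoin(base, filename)


print(supplement_url("http://iopscience.iop.org/0004-637X/659/1/98", "datafile2.txt"))
```

<p>The trailing-slash check matters: without it, joining against an article URL that lacks the final slash would silently drop the last path segment.</p>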
<p>The ultimate objective was to load the results of crawling these links into Gridworks, but that means getting the data into CSV or TSV form. Rather than have the crawl script output straight to CSV, I stashed the results in a <a href="http://www.mongodb.org/">MongoDB</a> instance. Here&#8217;s an example of one of the resulting JSON documents in Mongo:</p>
<div class="codecolorer-container javascript default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:435px;height:300px;"><div class="javascript codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap"><span style="color: #009900;">&#123;</span>u<span style="color: #3366CC;">'_id'</span><span style="color: #339933;">:</span> ObjectId<span style="color: #009900;">&#40;</span><span style="color: #3366CC;">'4bfc3737a1f714263b000012'</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">,</span><br />
&nbsp;u<span style="color: #3366CC;">'anchor_text'</span><span style="color: #339933;">:</span> u<span style="color: #3366CC;">'http://astronomy.swin.edu.au/staff/dforbes/glob.html'</span><span style="color: #339933;">,</span><br />
&nbsp;u<span style="color: #3366CC;">'bibcode'</span><span style="color: #339933;">:</span> u<span style="color: #3366CC;">'2001ApJ...556L..83F'</span><span style="color: #339933;">,</span><br />
&nbsp;u<span style="color: #3366CC;">'content'</span><span style="color: #339933;">:</span> u<span style="color: #3366CC;">'&lt;HTML&gt;<span style="color: #000099; font-weight: bold;">\n</span>&lt;HEAD&gt;<span style="color: #000099; font-weight: bold;">\n</span>&lt;TITLE&gt;Duncan A. Forbes, Swinburne University, Globular Clusters&lt;/TITLE&gt;<span style="color: #000099; font-weight: bold;">\n</span>&lt;/HEAD&gt;<span style="color: #000099; font-weight: bold;">\n</span><span style="color: #000099; font-weight: bold;">\n</span>&lt;h1&gt; Globular Cluster Research&lt;/h1&gt;<span style="color: #000099; font-weight: bold;">\n</span><span style="color: #000099; font-weight: bold;">\n</span>I am interested in various aspects of Extragalactic Globular<span style="color: #000099; font-weight: bold;">\n</span> &nbsp; &nbsp;Cluster research. In particular the formation and evolution<span style="color: #000099; font-weight: bold;">\n</span> &nbsp; &nbsp;of Globular Cluster Systems and their host galaxies. 
<span style="color: #000099; font-weight: bold;">\n</span>&lt;br&gt;<span style="color: #000099; font-weight: bold;">\n</span><span style="color: #000099; font-weight: bold;">\n</span>&lt;UL&gt;<span style="color: #000099; font-weight: bold;">\n</span>&lt;A HREF=&quot;colours.html&quot;&gt;GLOBULAR CLUSTER PHOTOMETRY DATABASE&lt;/A&gt;<span style="color: #000099; font-weight: bold;">\n</span>&lt;/UL&gt;<span style="color: #000099; font-weight: bold;">\n</span><span style="color: #000099; font-weight: bold;">\n</span>&lt;UL&gt;<span style="color: #000099; font-weight: bold;">\n</span>&lt;A HREF=&quot;spectra.html&quot;&gt;GLOBULAR CLUSTER SPECTRAL DATABASE&lt;/A&gt;<span style="color: #000099; font-weight: bold;">\n</span>&lt;/UL&gt;<span style="color: #000099; font-weight: bold;">\n</span><span style="color: #000099; font-weight: bold;">\n</span><span style="color: #000099; font-weight: bold;">\n</span>&lt;UL&gt;<span style="color: #000099; font-weight: bold;">\n</span>&lt;A HREF=&quot;review.html&quot;&gt;GLOBULAR CLUSTER REVIEW PAPERS&lt;/A&gt;<span style="color: #000099; font-weight: bold;">\n</span>&lt;/UL&gt;<span style="color: #000099; font-weight: bold;">\n</span><span style="color: #000099; font-weight: bold;">\n</span><span style="color: #000099; font-weight: bold;">\n</span>&lt;UL&gt;<span style="color: #000099; font-weight: bold;">\n</span>&lt;A HREF=&quot;http://www.ucolick.org/~brodie/Sages/sages.html&quot;&gt; SAGES PROJECT&lt;/A&gt;<span style="color: #000099; font-weight: bold;">\n</span>&lt;/UL&gt;<span style="color: #000099; font-weight: bold;">\n</span><span style="color: #000099; font-weight: bold;">\n</span>&lt;UL&gt;<span style="color: #000099; font-weight: bold;">\n</span>&lt;A<span style="color: #000099; font-weight: bold;">\n</span><span style="color: #000099; font-weight: bold;">\t</span> &nbsp;HREF=&quot;http://www.physics.mcmaster.ca/resources/fs3_resources.html&quot;&gt; HARRIS DATABASE&lt;/A&gt;<span style="color: #000099; font-weight: 
bold;">\n</span>&lt;/UL&gt;<span style="color: #000099; font-weight: bold;">\n</span><span style="color: #000099; font-weight: bold;">\n</span><span style="color: #000099; font-weight: bold;">\n</span><span style="color: #000099; font-weight: bold;">\n</span>&lt;tr&gt;&lt;td&gt;&lt;hr noshade&gt;&lt;/td&gt;&lt;/tr&gt;<span style="color: #000099; font-weight: bold;">\n</span><span style="color: #000099; font-weight: bold;">\n</span> &lt;/BODY&gt;<span style="color: #000099; font-weight: bold;">\n</span>'</span><span style="color: #339933;">,</span><br />
&nbsp;u<span style="color: #3366CC;">'context'</span><span style="color: #339933;">:</span> u<span style="color: #3366CC;">'&lt;p&gt;The combined sample data are available at &lt;ext-link ext-link-type=&quot;uri&quot; xlink:href=&quot;http://astronomy.swin.edu.au/staff/dforbes/glob.html&quot;&gt;http://astronomy.swin.edu.au/staff/dforbes/glob.html&lt;/ext-link&gt;. &lt;/p&gt;<span style="color: #000099; font-weight: bold;">\n</span>'</span><span style="color: #339933;">,</span><br />
&nbsp;u<span style="color: #3366CC;">'doi'</span><span style="color: #339933;">:</span> u<span style="color: #3366CC;">'10.1086/323006'</span><span style="color: #339933;">,</span><br />
&nbsp;u<span style="color: #3366CC;">'ft_source'</span><span style="color: #339933;">:</span> u<span style="color: #3366CC;">'/proj/ads/articles/sources/AAS/ApJL/2001/556/2/323006/323006.xml'</span><span style="color: #339933;">,</span><br />
&nbsp;u<span style="color: #3366CC;">'link_id'</span><span style="color: #339933;">:</span> u<span style="color: #3366CC;">'http://astronomy.swin.edu.au/staff/dforbes/glob.html'</span><span style="color: #339933;">,</span><br />
&nbsp;u<span style="color: #3366CC;">'link_type'</span><span style="color: #339933;">:</span> u<span style="color: #3366CC;">'UrlLink'</span><span style="color: #339933;">,</span><br />
&nbsp;u<span style="color: #3366CC;">'response'</span><span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>u<span style="color: #3366CC;">'accept-ranges'</span><span style="color: #339933;">:</span> u<span style="color: #3366CC;">'bytes'</span><span style="color: #339933;">,</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;u<span style="color: #3366CC;">'content-length'</span><span style="color: #339933;">:</span> u<span style="color: #3366CC;">'781'</span><span style="color: #339933;">,</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;u<span style="color: #3366CC;">'content-location'</span><span style="color: #339933;">:</span> u<span style="color: #3366CC;">'http://astronomy.swin.edu.au/~dforbes/glob.html'</span><span style="color: #339933;">,</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;u<span style="color: #3366CC;">'content-type'</span><span style="color: #339933;">:</span> u<span style="color: #3366CC;">'text/html; charset=UTF-8'</span><span style="color: #339933;">,</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;u<span style="color: #3366CC;">'date'</span><span style="color: #339933;">:</span> u<span style="color: #3366CC;">'Tue, 25 May 2010 10:14:07 GMT'</span><span style="color: #339933;">,</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;u<span style="color: #3366CC;">'server'</span><span style="color: #339933;">:</span> u<span style="color: #3366CC;">'Apache/2.2.15 (Unix) DAV/2 mod_ssl/2.2.15 OpenSSL/0.9.8e-fips-rhel5'</span><span style="color: #339933;">,</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;u<span style="color: #3366CC;">'status'</span><span style="color: #339933;">:</span> u<span style="color: #3366CC;">'200'</span><span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span><br />
&nbsp;u<span style="color: #3366CC;">'solr_id'</span><span style="color: #339933;">:</span> u<span style="color: #3366CC;">'31908'</span><span style="color: #339933;">,</span><br />
&nbsp;u<span style="color: #3366CC;">'url'</span><span style="color: #339933;">:</span> u<span style="color: #3366CC;">'http://astronomy.swin.edu.au/staff/dforbes/glob.html'</span><span style="color: #339933;">,</span><br />
&nbsp;u<span style="color: #3366CC;">'xpath'</span><span style="color: #339933;">:</span> u<span style="color: #3366CC;">'/html/article/body/sec[5]/fn-group/fn/p/ext-link'</span><span style="color: #009900;">&#125;</span></div></div>
<p>From there it was easy to dump what I needed to csv and load into Gridworks. I&#8217;m not going to get into how totally awesome the Gridworks software is, except to say you should watch the <a href="http://vimeo.com/groups/gridworks">demo videos</a>.</p>
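<p>For the curious, the dump step might look roughly like the sketch below. It is not the script I actually ran: it works on plain dicts rather than a live Mongo cursor, the helper names <em>doc_to_row</em> and <em>dump_csv</em> are invented for the example, and the column selection is an assumption based on the fields in the document shown above (the bulky <em>content</em> and <em>context</em> fields are left out).</p>

```python
import csv
import io

# Columns to export; the names follow the fields of the example document.
COLUMNS = ["_id", "solr_id", "bibcode", "doi", "xpath", "link_type",
           "anchor_text", "ft_source", "link_id", "url",
           "status", "content-type", "content-length"]


def doc_to_row(doc):
    """Flatten one crawl document into a flat dict of CSV cell values."""
    response = doc.get("response", {})
    row = {}
    for name in COLUMNS:
        # Top-level fields win; otherwise fall back to the response headers.
        value = doc.get(name, response.get(name, ""))
        row[name] = str(value)
    return row


def dump_csv(docs):
    """Serialize an iterable of documents to CSV text with a header row."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    for doc in docs:
        writer.writerow(doc_to_row(doc))
    return out.getvalue()
```

<p>Pulling the nested response headers up to the top level is the only real work here; Gridworks wants one flat row per link.</p>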
<p>I can&#8217;t post the entire Gridworks project, but here are some screencaps, a column list and some of the more interesting facets.</p>
<p style="text-align: center;">
<div id="attachment_147" class="wp-caption aligncenter" style="width: 310px"><a href="http://blog.reallywow.com/wp-content/uploads/2010/05/gridworks.png"><img class="size-medium wp-image-147 " title="gridworks" src="http://blog.reallywow.com/wp-content/uploads/2010/05/gridworks-300x191.png" alt="" width="300" height="191" /></a><p class="wp-caption-text">Initial data load plus some derived columns</p></div>
<p>Column list:</p>
<ul>
<li>Id of the MongoDB doc</li>
<li>Id of the solr doc</li>
<li>ADS bibcode identifier of the article</li>
<li>Publication year &#8211; derived from the bibcode</li>
<li>DOI</li>
<li>xpath expression of the &lt;ext-link&gt; element</li>
<li>parent tag &#8211; the containing element type</li>
<li>link context &#8211; the containing element&#8217;s serialized xml contents</li>
<li>link type &#8211; one of url, dataset or supplement</li>
<li>anchor text &#8211; the text contents of the &lt;ext-link&gt;</li>
<li>full text source file</li>
<li>journal</li>
<li>full text source &#8211; publisher</li>
<li>extlink id &#8211; either the url or the dataset id or the supplement filename</li>
<li>domain &#8211; derived from the url</li>
<li>status &#8211; http status returned when requesting the resource</li>
<li>content-type &#8211; content-type header returned in the response</li>
<li>mimetype &#8211; derived from the content-type response header</li>
<li>location &#8211; the final url of the resource following any redirects</li>
<li>content length</li>
<li>response headers &#8211; list of all the header attribute names returned in the response (just to see what other interesting stuff might be there)</li>
</ul>
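<p>Several of the columns above are derived from others. A rough Python equivalent of three of those derivations is sketched below; the function names are hypothetical, and this is not necessarily how the columns were computed in the Gridworks project. The year relies on the fact that an ADS bibcode starts with the four-digit publication year.</p>

```python
from urllib.parse import urlparse


def publication_year(bibcode):
    """The first four characters of an ADS bibcode are the publication year."""
    return int(bibcode[:4])


def domain(url):
    """Host name of the link target (urlparse already lower-cases it)."""
    return urlparse(url).hostname or ""


def mimetype(content_type):
    """Strip parameters such as charset from a Content-Type header value."""
    return content_type.split(";", 1)[0].strip().lower()


print(publication_year("2001ApJ...556L..83F"))
print(domain("http://astronomy.swin.edu.au/staff/dforbes/glob.html"))
print(mimetype("text/html; charset=UTF-8"))
```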
<div id="attachment_148" class="wp-caption aligncenter" style="width: 287px"><a href="http://blog.reallywow.com/wp-content/uploads/2010/05/gridworks_linktype.png"><img class="size-full wp-image-148 " title="gridworks_linktype" src="http://blog.reallywow.com/wp-content/uploads/2010/05/gridworks_linktype.png" alt="" width="277" height="143" /></a><p class="wp-caption-text">Still to be determined how many of the url links point to some kind of data</p></div>
<div id="attachment_149" class="wp-caption aligncenter" style="width: 284px"><a href="http://blog.reallywow.com/wp-content/uploads/2010/05/gridworks_parenttag.png"><img class="size-full wp-image-149" title="gridworks_parenttag" src="http://blog.reallywow.com/wp-content/uploads/2010/05/gridworks_parenttag.png" alt="" width="274" height="260" /></a><p class="wp-caption-text">Knowing the container could help parsing out something about the semantics of the link</p></div>
<p style="text-align: center;">
<div id="attachment_150" class="wp-caption aligncenter" style="width: 284px"><a href="http://blog.reallywow.com/wp-content/uploads/2010/05/gridworks_status.png"><img class="size-full wp-image-150" title="gridworks_status" src="http://blog.reallywow.com/wp-content/uploads/2010/05/gridworks_status.png" alt="" width="274" height="370" /></a><p class="wp-caption-text">~70% 200&#39;s was more than I expected. Of course 200 doesn&#39;t mean it actually found something interesting.</p></div>
<div id="attachment_151" class="wp-caption aligncenter" style="width: 283px"><a href="http://blog.reallywow.com/wp-content/uploads/2010/05/gridworks_contenttype.png"><img class="size-full wp-image-151" title="gridworks_contenttype" src="http://blog.reallywow.com/wp-content/uploads/2010/05/gridworks_contenttype.png" alt="" width="273" height="473" /></a><p class="wp-caption-text">I would have hoped for fewer text/html responses</p></div>
<div id="attachment_152" class="wp-caption aligncenter" style="width: 285px"><a href="http://blog.reallywow.com/wp-content/uploads/2010/05/gridworks_domain.png"><img class="size-full wp-image-152  " title="gridworks_domain" src="http://blog.reallywow.com/wp-content/uploads/2010/05/gridworks_domain.png" alt="" width="275" height="321" /></a><p class="wp-caption-text">All the gcn.gsfc.nasa.gov hits look like observation reports, like this one, which I think is a good thing</p></div>
<p style="text-align: left;">Finally, thanks to <a href="http://dysinterested.com/">Sean Hannan</a>, who worked out <a href="http://gist.github.com/414927">a hack</a> to a bit of the Gridworks JavaScript that automatically turns any cell values beginning with &#8220;http://&#8221; or &#8220;https://&#8221; into active links. The nice thing about that was it let me turn the column containing the MongoDB id into a link to a little <a href="http://webpy.org">web.py</a> script that dumps a JSON representation of the document.</p>
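<p>The JSON-dumping part of that little script could be as small as the sketch below. The web.py routing and the Mongo lookup are left out, and <em>document_as_json</em> is a made-up name; the one assumption worth flagging is that anything the json module can&#8217;t encode on its own (MongoDB&#8217;s ObjectId, for one) is simply converted with str().</p>

```python
import json


def document_as_json(doc):
    """Render a crawl document as pretty-printed JSON text."""
    # default=str is a catch-all for values json cannot encode natively,
    # such as MongoDB's ObjectId, which stringifies to its hex form.
    return json.dumps(doc, default=str, indent=2, sort_keys=True)


print(document_as_json({"bibcode": "2001ApJ...556L..83F", "link_type": "UrlLink"}))
```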
<p>* NLM allows for links to external resources using either <a href="http://dtd.nlm.nih.gov/articleauthoring/tag-library/2.3/n-ju50.html">&lt;ext-link&gt;</a> or <a href="http://dtd.nlm.nih.gov/articleauthoring/tag-library/2.3/n-2hw0.html">&lt;supplementary-material&gt;</a> elements.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.reallywow.com/archives/135/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Embedding citation metadata in the ADS HTML</title>
		<link>http://blog.reallywow.com/archives/123</link>
		<comments>http://blog.reallywow.com/archives/123#comments</comments>
		<pubDate>Mon, 01 Mar 2010 16:08:18 +0000</pubDate>
		<dc:creator>lbjay</dc:creator>
				<category><![CDATA[ADS]]></category>

		<guid isPermaLink="false">http://blog.reallywow.com/?p=123</guid>
		<description><![CDATA[Here&#8217;s what I know: you can embed a set of &#60;meta/&#62; tags containing citation metadata in your HTML to help Google Scholar to index your content. We&#8217;ve been doing it at ADS for quite a while. I&#8217;m not certain if the impetus came directly from Google, or, more likely, we got the idea from a [...]]]></description>
			<content:encoded><![CDATA[<abbr class="unapi-id" title="http://blog.reallywow.com/?p=123"><!-- &nbsp; --></abbr>
<p>Here&#8217;s what I know: you can embed a set of &lt;meta/&gt; tags containing citation metadata in your HTML to help Google Scholar to index your content. We&#8217;ve been doing it at <a href="http://ads.harvard.edu">ADS</a> for quite a while. I&#8217;m not certain if the impetus came directly from Google, or, more likely, we got the idea from a <a href="http://www.crossref.org/CrossTech/2008/05/natures_metadata_for_web_pages_1.html">CrossTech blog post</a> by Tony Hammond that describes the technique.</p>
<p>For example, if you execute <code class="codecolorer bash default"><span class="bash">&nbsp;curl <span style="color: #660033;">-s</span> http:<span style="color: #000000; font-weight: bold;">//</span>adsabs.harvard.edu<span style="color: #000000; font-weight: bold;">/</span>abs<span style="color: #000000; font-weight: bold;">/</span>1977NuPhB.126..298A <span style="color: #000000; font-weight: bold;">|</span> <span style="color: #c20cb9; font-weight: bold;">grep</span> meta</span></code> you should see:</p>
<div class="codecolorer-container html4strict default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:435px;height:300px;"><div class="html4strict codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">...<br />
<span style="color: #009900;">&lt;<span style="color: #000000; font-weight: bold;">meta</span> <span style="color: #000066;">name</span><span style="color: #66cc66;">=</span><span style="color: #ff0000;">&quot;citation_language&quot;</span> <span style="color: #000066;">content</span><span style="color: #66cc66;">=</span><span style="color: #ff0000;">&quot;en&quot;</span> <span style="color: #66cc66;">/</span>&gt;</span><br />
<span style="color: #009900;">&lt;<span style="color: #000000; font-weight: bold;">meta</span> <span style="color: #000066;">name</span><span style="color: #66cc66;">=</span><span style="color: #ff0000;">&quot;citation_doi&quot;</span> <span style="color: #000066;">content</span><span style="color: #66cc66;">=</span><span style="color: #ff0000;">&quot;10.1016/0550-3213(77)90384-4&quot;</span> <span style="color: #66cc66;">/</span>&gt;</span><br />
<span style="color: #009900;">&lt;<span style="color: #000000; font-weight: bold;">meta</span> <span style="color: #000066;">name</span><span style="color: #66cc66;">=</span><span style="color: #ff0000;">&quot;citation_abstract_html_url&quot;</span> <span style="color: #000066;">content</span><span style="color: #66cc66;">=</span><span style="color: #ff0000;">&quot;http://adsabs.harvard.edu/abs/1977NuPhB.126..298A&quot;</span> <span style="color: #66cc66;">/</span>&gt;</span><br />
<span style="color: #009900;">&lt;<span style="color: #000000; font-weight: bold;">meta</span> <span style="color: #000066;">name</span><span style="color: #66cc66;">=</span><span style="color: #ff0000;">&quot;citation_title&quot;</span> <span style="color: #000066;">content</span><span style="color: #66cc66;">=</span><span style="color: #ff0000;">&quot;Asymptotic freedom in parton language&quot;</span> <span style="color: #66cc66;">/</span>&gt;</span><br />
<span style="color: #009900;">&lt;<span style="color: #000000; font-weight: bold;">meta</span> <span style="color: #000066;">name</span><span style="color: #66cc66;">=</span><span style="color: #ff0000;">&quot;citation_authors&quot;</span> <span style="color: #000066;">content</span><span style="color: #66cc66;">=</span><span style="color: #ff0000;">&quot;Altarelli, G.; Parisi, G.&quot;</span> <span style="color: #66cc66;">/</span>&gt;</span><br />
<span style="color: #009900;">&lt;<span style="color: #000000; font-weight: bold;">meta</span> <span style="color: #000066;">name</span><span style="color: #66cc66;">=</span><span style="color: #ff0000;">&quot;citation_issn&quot;</span> <span style="color: #000066;">content</span><span style="color: #66cc66;">=</span><span style="color: #ff0000;">&quot;0550-3213&quot;</span> <span style="color: #66cc66;">/</span>&gt;</span><br />
<span style="color: #009900;">&lt;<span style="color: #000000; font-weight: bold;">meta</span> <span style="color: #000066;">name</span><span style="color: #66cc66;">=</span><span style="color: #ff0000;">&quot;citation_date&quot;</span> <span style="color: #000066;">content</span><span style="color: #66cc66;">=</span><span style="color: #ff0000;">&quot;08/1977&quot;</span> <span style="color: #66cc66;">/</span>&gt;</span><br />
<span style="color: #009900;">&lt;<span style="color: #000000; font-weight: bold;">meta</span> <span style="color: #000066;">name</span><span style="color: #66cc66;">=</span><span style="color: #ff0000;">&quot;citation_journal_title&quot;</span> <span style="color: #000066;">content</span><span style="color: #66cc66;">=</span><span style="color: #ff0000;">&quot;Nuclear Physics B&quot;</span> <span style="color: #66cc66;">/</span>&gt;</span><br />
<span style="color: #009900;">&lt;<span style="color: #000000; font-weight: bold;">meta</span> <span style="color: #000066;">name</span><span style="color: #66cc66;">=</span><span style="color: #ff0000;">&quot;citation_volume&quot;</span> <span style="color: #000066;">content</span><span style="color: #66cc66;">=</span><span style="color: #ff0000;">&quot;126&quot;</span> <span style="color: #66cc66;">/</span>&gt;</span><br />
<span style="color: #009900;">&lt;<span style="color: #000000; font-weight: bold;">meta</span> <span style="color: #000066;">name</span><span style="color: #66cc66;">=</span><span style="color: #ff0000;">&quot;citation_firstpage&quot;</span> <span style="color: #000066;">content</span><span style="color: #66cc66;">=</span><span style="color: #ff0000;">&quot;298&quot;</span> <span style="color: #66cc66;">/</span>&gt;</span><br />
<span style="color: #009900;">&lt;<span style="color: #000000; font-weight: bold;">meta</span> <span style="color: #000066;">name</span><span style="color: #66cc66;">=</span><span style="color: #ff0000;">&quot;citation_lastpage&quot;</span> <span style="color: #000066;">content</span><span style="color: #66cc66;">=</span><span style="color: #ff0000;">&quot;318&quot;</span> <span style="color: #66cc66;">/</span>&gt;</span><br />
...</div></div>
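<p>If you&#8217;d rather check the tags programmatically than eyeball grep output, something like the following Python sketch would do. It uses only the standard library&#8217;s html.parser; the class and function names are invented for the example, and fetching the page is left to you. Values are collected in lists because a name such as citation_keywords may appear more than once.</p>

```python
from html.parser import HTMLParser


class CitationMetaParser(HTMLParser):
    """Collect the content of every meta tag whose name starts with citation_."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        # Self-closing meta tags also arrive here via handle_startendtag.
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name") or ""
        if name.startswith("citation_"):
            # A name may repeat, so keep a list of values per name.
            self.fields.setdefault(name, []).append(attributes.get("content") or "")


def citation_fields(html):
    """Return a dict mapping each citation_* name to its list of values."""
    parser = CitationMetaParser()
    parser.feed(html)
    parser.close()
    return parser.fields
```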
<p>Since the first implementation we&#8217;ve had some back-and-forth with Abhishek Jain at Google Scholar to ensure we&#8217;re making use of the full set of fields that Google Scholar looks for.*</p>
<p><a href="http://onebiglibrary.net/">Dan Chudnov</a>, David Bucknum &amp; <a href="http://inkdroid.org/journal/">Ed Summers</a> at the LoC recently expressed interest in also embedding these tags. In the absence of an official reference from the Google Scholar folks, I figured it would be a good idea to post the list here.</p>
<ul>
<li>citation_language</li>
<li>citation_doi</li>
<li>citation_abstract_html_url</li>
<li>citation_title</li>
<li>citation_authors</li>
<li>citation_issn</li>
<li>citation_date</li>
<li>citation_journal_title</li>
<li>citation_volume</li>
<li>citation_firstpage</li>
<li>citation_lastpage</li>
<li>citation_publisher</li>
<li>citation_issue</li>
<li>citation_pdf_url</li>
<li>citation_pmid</li>
<li>citation_keywords (multiple instances OK)</li>
<li>citation_conference</li>
<li>citation_dissertation_name</li>
<li>citation_dissertation_institution</li>
<li>citation_patent_number</li>
<li>citation_patent_country</li>
<li>citation_technical_report_number</li>
<li>citation_technical_report_institution</li>
</ul>
<p>I had to cull this list via a visual scan of a long, forwarded e-mail thread. So, like I tried to insinuate above, it sure would be great if Google Scholar would publish an official reference to this schema somewhere.</p>
<p>* all instances of the term &#8220;we&#8221; should really be read as &#8220;my boss, Alberto&#8221;.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.reallywow.com/archives/123/feed</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Contextual Inquiry on the Cheap</title>
		<link>http://blog.reallywow.com/archives/112</link>
		<comments>http://blog.reallywow.com/archives/112#comments</comments>
		<pubDate>Thu, 31 Dec 2009 15:09:47 +0000</pubDate>
		<dc:creator>lbjay</dc:creator>
				<category><![CDATA[ADS]]></category>
		<category><![CDATA[UX]]></category>

		<guid isPermaLink="false">http://blog.reallywow.com/?p=112</guid>
		<description><![CDATA[I thought I&#8217;d share the interview outline I&#8217;ve been using to conduct some low effort contextual inquiry sessions with ADS users. Classic contextual inquiry, in which the researcher sits with or shadows a person in the context of the subject&#8217;s own working environment, is often conducted in 3+ hour sessions, frequently with all manner of [...]]]></description>
			<content:encoded><![CDATA[<abbr class="unapi-id" title="http://blog.reallywow.com/?p=112"><!-- &nbsp; --></abbr>
<p>I thought I&#8217;d share the <a href="http://docs.google.com/View?id=df2kgdvp_272d9mbxrfg">interview outline</a> I&#8217;ve been using to conduct some low effort contextual inquiry sessions with <a href="http://ads.harvard.edu">ADS</a> users.</p>
<div id="attachment_116" class="wp-caption alignright" style="width: 160px"><a href="http://docs.google.com/View?id=df2kgdvp_272d9mbxrfg"><img class="size-thumbnail wp-image-116 " title="interview outline" src="http://blog.reallywow.com/wp-content/uploads/2009/12/interview1-150x150.png" alt="" width="150" height="150" /></a><p class="wp-caption-text">thumbnail links to google doc</p></div>
<p>Classic <a title="Contextual inquiry - Wikipedia, the free encyclopedia" href="http://en.wikipedia.org/wiki/Contextual_inquiry">contextual inquiry</a>, in which the researcher sits with or shadows a person in the context of the subject&#8217;s own working environment, is often conducted in 3+ hour sessions, frequently with all manner of video capturing equipment. My goal is to cut that time down to 30 minutes, partly because this whole user research thing is supposed to be a part-time endeavor, and also because the majority of ADS users are PhDs, and we all know just <a href="http://www.nytimes.com/2009/09/22/technology/internet/22netflix.html?_r=2&amp;ref=technology&amp;pagewanted=all">how valuable their time is</a>.</p>
<p>So far I&#8217;ve only managed to conduct four of these interviews (with two more scheduled). I would love to get a total of 10. Since I don&#8217;t have access to video equipment I simply mash out typewritten, poorly spelled notes as fast as I can. The notes have a stream of consciousness flavor, but the early indications are that the information gathered will be valuable.</p>
<p>Example notes:</p>
<pre style="padding-left: 30px;">refers to bibcode as "indexing thing". "not any use to me."
wrote a perl script that rewrites the bibcode into something understandabl
other strategies for searching for particular star: entering star name into abstract search or title search.
finds one article using abstract search.
mentions that he doesn't know boolean sytnax by memory
to find more tries going to simbad and finds alternate names for the star</pre>
]]></content:encoded>
			<wfw:commentRss>http://blog.reallywow.com/archives/112/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Mad Anachronisms</title>
		<link>http://blog.reallywow.com/archives/104</link>
		<comments>http://blog.reallywow.com/archives/104#comments</comments>
		<pubDate>Wed, 19 Aug 2009 13:18:38 +0000</pubDate>
		<dc:creator>lbjay</dc:creator>
				<category><![CDATA[TV]]></category>
		<category><![CDATA[environment]]></category>
		<category><![CDATA[madmen]]></category>

		<guid isPermaLink="false">http://blog.reallywow.com/?p=104</guid>
		<description><![CDATA[Like the rest of the planet it seems, I&#8217;ve been consumed lately by the show Mad Men. Jennifer and I are still catching up via Netflix. It really is one of those truly great and remarkable shows that comes along too rarely. [Warning: Insignificant spoiler in the next couple of sentences] About midway through the [...]]]></description>
			<content:encoded><![CDATA[<abbr class="unapi-id" title="http://blog.reallywow.com/?p=104"><!-- &nbsp; --></abbr>
<p>Like the rest of the planet it seems, I&#8217;ve been consumed lately by the show Mad Men. Jennifer and I are still catching up via Netflix. It really is one of those truly great and remarkable shows that comes along too rarely.</p>
<p><strong>[Warning: Insignificant spoiler in the next couple of sentences]</strong></p>
<p>About midway through the second season we came to the somewhat infamous picnic scene. The Drapers have taken the new Cadillac for a spin in the countryside. They relax and recline on a blanket. The grass is green, the breeze is mild, they talk about how rich they are. All is good. Then it&#8217;s time to pack it up and head home. We see Don stand up, stretch, smile, and chuck his empty beer can into the idyllic landscape as if he was tossing a baseball to his son. Betty pinches two corners of the blanket and gives it a lift &amp; shake, distributing the paper plates, napkins and other picnic detritus across the grass. The trash begins to lightly flutter and drift down the slope. In 1963 the happy family piles into the car and motors away. Meanwhile, in 2009, we sit on the couch, jaws agape at this stunning spectacle of thoughtless littering.</p>
<p>It&#8217;s surprisingly shocking. There&#8217;s the shock of seeing it, and then there&#8217;s the shock at being so shocked. Every bone in your body wants to be repulsed, but the relativist mindset makes it difficult to fault the characters. As the writer points out in the DVD commentary, <a href="http://en.wikipedia.org/wiki/Keep_America_Beautiful">Iron Eyes Cody</a> didn&#8217;t come along until 1971.</p>
<p>The scene also in a way briefly cracks open that narrative fourth wall in that it&#8217;s clear the writer/director is blatantly highlighting these banal actions to serve up a very in-your-face cultural anachronism. It&#8217;s only been 40 years, but wow have the dominant cultural attitudes about the environment changed.</p>
<p>We&#8217;ve since watched a few more episodes, but that scene still sticks with me, and lately it&#8217;s got me imagining someone sitting on their couch in 2049&#8211;or hovering in their Anti-Grav Lounger, or whatever&#8211;and passing judgment on our present day actions. It&#8217;s interesting to think what might be the contemporary equivalents of folks in the early 60s treating the planet like a giant trash receptacle.</p>
<p>I&#8217;m guessing they&#8217;ll look back in horror at us actually throwing things&#8211;<em>anything!</em>&#8211;away in a garbage can rather than somehow recycling or composting.</p>
<p><em>You mean the water from the shower just drains away into the sewer?!?</em></p>
<p><em>They have apples in a Boston supermarket that were grown in New Zealand? Insanity!</em></p>
<p>I mean, can you imagine!?</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.reallywow.com/archives/104/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>IRC Blocked? Create an SSH tunnel with PuTTY</title>
		<link>http://blog.reallywow.com/archives/83</link>
		<comments>http://blog.reallywow.com/archives/83#comments</comments>
		<pubDate>Fri, 12 Jun 2009 16:19:06 +0000</pubDate>
		<dc:creator>lbjay</dc:creator>
				<category><![CDATA[hacks]]></category>
		<category><![CDATA[proxy]]></category>
		<category><![CDATA[putty]]></category>
		<category><![CDATA[ssh]]></category>

		<guid isPermaLink="false">http://blog.reallywow.com/?p=83</guid>
		<description><![CDATA[Can&#8217;t get to IRC because port 6667 is blocked on your local network? Here are some instructions for how to create an SSH tunnel using PuTTY, and then connect to freenode (or any other IRC server) with Pidgin using the tunnel as a SOCKS5 proxy. You can most likely s/Pidgin/your IRC client of choice/, but [...]]]></description>
			<content:encoded><![CDATA[<abbr class="unapi-id" title="http://blog.reallywow.com/?p=83"><!-- &nbsp; --></abbr>
<p>Can&#8217;t get to IRC because port 6667 is blocked on your local network? Here are some instructions for how to create an SSH tunnel using <a href="http://www.putty.org/">PuTTY</a>, and then connect to freenode (or any other IRC server) with <a href="http://www.pidgin.im/">Pidgin</a> using the tunnel as a SOCKS5 proxy. You can most likely <em>s/Pidgin/your IRC client of choice/</em>, but the screenshots below will show the Pidgin config dialogs. These instructions assume that SSH, port 22, is not also blocked. Woe be to you if that is the case.</p>
<p>My original source for how to do this was <a href="http://everythinghurts.com/ssh-tunnelling/">this post</a>, which describes the same trick but for Firefox.</p>
<p><strong>Step 1</strong>: create a new PuTTY session configuration. In this case I&#8217;m using the login <strong>me@ssh.example.com</strong> and calling the session <strong>irc-7777</strong>. I usually name the session based on the local port number I&#8217;m going to forward.</p>
<div id="attachment_93" class="wp-caption aligncenter" style="width: 466px"><img class="size-full wp-image-93" title="putty1" src="http://blog.reallywow.com/wp-content/uploads/2009/06/putty11.png" alt="create and save a new PuTTY session " width="456" height="435" /><p class="wp-caption-text">create and save a new PuTTY session </p></div>
<p><strong>Step 2</strong>: go to the <strong>Connection -&gt; SSH -&gt; Tunnels</strong> node of the session config. In the Source port field enter &#8220;7777&#8221; (or some other port number). In the radio button section below that select <strong>Dynamic</strong> and <strong>Auto</strong>. Click the <strong>Add</strong> button. You should see &#8220;D7777&#8221; appear in the list of forwarded ports.</p>
<div id="attachment_94" class="wp-caption aligncenter" style="width: 466px"><img class="size-full wp-image-94" title="putty2" src="http://blog.reallywow.com/wp-content/uploads/2009/06/putty21.png" alt="Configure the ssh tunnel " width="456" height="435" /><p class="wp-caption-text">Configure the ssh tunnel </p></div>
<div id="attachment_95" class="wp-caption aligncenter" style="width: 466px"><img class="size-full wp-image-95" title="putty3" src="http://blog.reallywow.com/wp-content/uploads/2009/06/putty31.png" alt="Tunnel D7777 appears in the list" width="456" height="435" /><p class="wp-caption-text">Tunnel D7777 appears in the list</p></div>
<p><strong>Step 3</strong>: go back to the main session config node and save the session again. Then open the PuTTY session by clicking the <strong>Open</strong> button. A normal looking PuTTY terminal window should open. This session is your tunnel so you should probably leave it be, i.e., don&#8217;t use it both for doing stuff in the shell and as a tunnel (although I don&#8217;t really know what consequences that would lead to). If the fact that this tunnel takes up space in your TaskBar bothers you (it does me), check out <a href="http://haanstra.eu/putty/">PuTTY-Tray</a>.</p>
<div id="attachment_97" class="wp-caption aligncenter" style="width: 310px"><img class="size-medium wp-image-97" title="putty4" src="http://blog.reallywow.com/wp-content/uploads/2009/06/putty4-300x150.png" alt="You've probably never seen one of these before" width="300" height="150" /><p class="wp-caption-text">You&#39;ve probably never seen one of these before</p></div>
<p><strong>Step 4</strong>: Configure your Pidgin IRC account to use the tunnel as a SOCKS5 proxy. Go to Accounts -&gt; Manage Accounts. Highlight your IRC protocol account and click Modify (or create one by clicking Add). Go to the Advanced tab of the config dialog. In the Proxy Options section select SOCKS5 as the proxy type, enter &#8220;localhost&#8221; as the Host and &#8220;7777&#8221; (or whatever port you used) as the Port.</p>
<div id="attachment_96" class="wp-caption aligncenter" style="width: 314px"><img class="size-full wp-image-96" title="pidgin" src="http://blog.reallywow.com/wp-content/uploads/2009/06/pidgin1.png" alt="Specify your local tunnel as the SOCKS5 proxy" width="304" height="520" /><p class="wp-caption-text">Specify your local tunnel as the SOCKS5 proxy</p></div>
<p>Save and that&#8217;s it. You should be able to connect to the IRC server through Pidgin.</p>
<p>On a *nix machine (or using Cygwin, I suppose) this is, of course, much simpler. You can replace the PuTTY steps with a single ssh command: <em>ssh -D localhost:7777 me@ssh.example.com</em></p>
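<p>Before pointing an IRC client at the tunnel it can help to confirm that something is actually listening on the local port. Here is a minimal Python sketch of that check. The helper name <em>port_is_open</em> is hypothetical, and the host and port are simply the values assumed in the steps above. It only tests that the port accepts TCP connections, not that the SOCKS5 proxy behind it works.</p>
<pre lang="python">import socket

def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        # create_connection raises OSError if nothing is listening
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Example (7777 is the local port assumed in Step 2; use your own):
# port_is_open("localhost", 7777)</pre>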
]]></content:encoded>
			<wfw:commentRss>http://blog.reallywow.com/archives/83/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>gtfo: get the foaf out</title>
		<link>http://blog.reallywow.com/archives/74</link>
		<comments>http://blog.reallywow.com/archives/74#comments</comments>
		<pubDate>Thu, 23 Apr 2009 23:56:11 +0000</pubDate>
		<dc:creator>lbjay</dc:creator>
				<category><![CDATA[Linked Data]]></category>
		<category><![CDATA[code4lib]]></category>
		<category><![CDATA[foaf]]></category>
		<category><![CDATA[bcb4]]></category>
		<category><![CDATA[linkeddata]]></category>

		<guid isPermaLink="false">http://blog.reallywow.com/?p=74</guid>
		<description><![CDATA[Leading up to the Linked Data pre-conf at code4lib09 there were several irc discussions around just how to structure the day and what could we do to give attendees the best shot at having an &#8220;ah-ha moment&#8221;. One idea was to create a simple application that would demonstrate the potential of linked data while also [...]]]></description>
			<content:encoded><![CDATA[<abbr class="unapi-id" title="http://blog.reallywow.com/?p=74"><!-- &nbsp; --></abbr>
<p>Leading up to the <a href="http://wiki.code4lib.org/index.php/LinkedData">Linked Data pre-conf</a> at <a href="http://code4lib.org/conference/2009/">code4lib09</a> there were several <a href="http://code4lib.org/irc">irc</a> discussions around just how to structure the day and what could we do to give attendees the best shot at having an &#8220;ah-ha moment&#8221;. One idea was to create a simple application that would demonstrate the potential of linked data while also being participatory. I think it took <a href="http://inkdroid.org">Ed Summers</a> less than 24 hours to hack together the first iteration of the code4lib2009 attendees foaf crawler/gallery <a href="http://inkdroid.org/c4l2009/attendees">thingy</a>. He can usually be counted on for such feats of overnight engineering.</p>
<p><span id="more-74"></span>By the day of the pre-conf we already had a gallery of 20+ foaf profiles. I spent part of the afternoon that day trying to guide several folks through the process of creating a foaf file and getting linked into our new corner of the web of data (with varying degrees of success). Along the way several people added enhancements, bugs were identified and sometimes fixed, much was learned re: linked data, vocabs, <a href="http://www.w3.org/2001/tag/issues.html#httpRange-14">hash fragments vs. 303s</a>, etc. <a href="http://xplus3.net/">Jonathan Brinley</a> and <a href="http://twitter.com/mbklein">Michael Klein</a> even wrote a companion <a href="http://svn.breaksalot.org/supybot-plugins/plugins/FOAF/">Supybot plugin</a> for our irc bot, <a href="http://www.code4lib.org/id/zoia">zoia</a>. It was fun.</p>
<p>Since then my mind has come back to it now and then, mulling over how simple it would be to push some of the variables into a configuration file and make the crawler + gallery re-usable for other events, for example, the soon-to-be-happening <a href="http://www.barcampboston.org/">BarCampBoston 4</a>. This turned out to be relatively easy, and pretty quickly I had a <a href="http://reallywow.com/bcb4/attendees">demo attendees page</a> for bcb4. Some fellow bcb4 goers were nice enough to participate by asserting their attendance in their FOAF files, with me pretending to know them. I&#8217;ve decided not to try leading any sessions on linked data or foaf at bcb4, mostly because it&#8217;s my first time and I want to soak up what other folks are into, but maybe it&#8217;ll make for a good ice-breaker or something.</p>
<p>Here&#8217;s my problem though: I&#8217;m starting to second guess myself as to the level of general usefulness of this thing&#8211;which, btw, I have christened <strong>gtfo</strong>, aka &#8220;gitfo&#8221;, aka &#8220;get the foaf out&#8221;&#8211;outside of the context of a tutorial/demo. I mean, could it really even function as an actual conference event attendee gallery? There&#8217;s the whole issue of folks needing to foaf:knows each other. I feel like there&#8217;s the seed of something cool in there, but it may need a substantial rethink and refactor to get beyond being the equivalent of the pet store shopping cart of linked data apps.</p>
<p>I have a few ideas&#8230; like allowing the criteria for inclusion to be configurable as something other than attendance at a specified event&#8230; or allowing the crawler to traverse a network of people connected via <a href="http://trac.usefulinc.com/doap">DOAP</a> assertions rather than foaf:knows&#8230; or maybe allowing for galleries generated dynamically based on user input rather than a passive crawler, e.g., 1) choose seed node, 2) choose link relationship, 3) choose inclusion criteria, 4) generate gallery.</p>
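<p>To make that last idea a little more concrete, here is a rough Python sketch of the four steps. Everything in it is hypothetical: the in-memory <em>graph</em> dict stands in for data a real crawler would fetch from FOAF or DOAP documents, the names are made up, and <em>build_gallery</em> is not a function in gtfo.</p>
<pre lang="python">from collections import deque

def build_gallery(graph, seed, relation, include):
    """Breadth-first walk from seed along one relation, keeping nodes that pass include()."""
    seen = {seed}
    queue = deque([seed])
    gallery = []
    while queue:
        node = queue.popleft()
        if include(node):
            gallery.append(node)
        # follow only the chosen link relationship, e.g. "foaf:knows"
        for neighbour in graph.get(node, {}).get(relation, []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return gallery

# Toy data: people linked by foaf:knows (illustrative only)
graph = {
    "alice": {"foaf:knows": ["bob", "carol"]},
    "bob": {"foaf:knows": ["alice", "dave"]},
    "carol": {"foaf:knows": []},
}
attending = {"alice", "bob", "dave"}
print(build_gallery(graph, "alice", "foaf:knows", lambda person: person in attending))</pre>
<p>Swapping the <em>relation</em> string or the <em>include</em> function is what would make the same walk reusable for something other than event attendance.</p>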
<p>Anyway, if you&#8217;re interested in contributing ideas and/or code, I&#8217;ve pushed my fork of the original app to Github: <a href="http://github.com/lbjay/gtfo/tree/master">http://github.com/lbjay/gtfo/tree/master</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.reallywow.com/archives/74/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Basic Block Data Decomposition in Perl</title>
		<link>http://blog.reallywow.com/archives/65</link>
		<comments>http://blog.reallywow.com/archives/65#comments</comments>
		<pubDate>Tue, 03 Mar 2009 21:28:19 +0000</pubDate>
		<dc:creator>lbjay</dc:creator>
				<category><![CDATA[code]]></category>
		<category><![CDATA[hacks]]></category>
		<category><![CDATA[parallel]]></category>
		<category><![CDATA[perl parallel threads]]></category>

		<guid isPermaLink="false">http://blog.reallywow.com/?p=65</guid>
		<description><![CDATA[I was playing around with the idea of parallelizing something the other day to eke out some performance. Unfortunately, I&#8217;ve gotten a bit rusty since writing some MPI code for a parallel computing course a few years back. I got stuck on what should be the simple part of dividing up my input across the [...]]]></description>
			<content:encoded><![CDATA[<abbr class="unapi-id" title="http://blog.reallywow.com/?p=65"><!-- &nbsp; --></abbr>
<p>I was playing around with the idea of parallelizing something the other day to eke out some performance. Unfortunately, I&#8217;ve gotten a bit rusty since writing some MPI code for a parallel computing course a few years back. I got stuck on what should be the simple part of dividing up my input across the threads.</p>
<p><span id="more-65"></span>The goal is to divvy things up into contiguous blocks of roughly equal size. For example, if the size of your input is 38 (<em>n</em>) and you start four threads (<em>p</em>), you don&#8217;t want to give the first three threads chunks of 12 and the last thread a chunk of 2. You want slices of 9, 10, 9 and 10.</p>
<p>So I flailed away with loops and POSIX::floor for a little while and came pretty close to what I remembered. I finally had to drag out my <a href="http://books.google.com/books?id=tDxNyGSXg5IC">textbook</a> (and translate from the C macros) to get it right.</p>
<pre lang="perl">#!/usr/bin/perl

# Block Data Decomposition:
# Divide array n into p contiguous blocks of roughly equal size

use POSIX qw(floor);
use strict;

sub block_start {
    my ($i, $p, $n) = @_;
    return floor(($i * $n) / $p);
}

sub block_end {
    my ($i, $p, $n) = @_;
    return (block_start($i + 1, $p, $n) - 1);
}

my @input = get_input();
my $n = scalar @input;
my $p = 4;

for my $i (0..$p-1)
{
    my $start = block_start($i, $p, $n);
    my $end = block_end($i, $p, $n);
    my @range = @input[$start..$end];
    do_something(\@range);
}</pre>
<p>The idea is that</p>
<div class="codecolorer-container text default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:435px;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">do_something(\@range)</div></div>
<p>sends a slice of input off for processing by one of your threads. A pretty useful algorithm when doing this sort of thing. Certainly not rocket science. Which is why we should all be happy I&#8217;m not a rocket scientist.</p>
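<p>As a sanity check, the same arithmetic can be sketched in a few lines of Python (a re-implementation for illustration, not a translation of the textbook macros; <em>block_sizes</em> is an extra helper added here). For the example above, n = 38 and p = 4, the block boundaries fall at 0, 9, 19 and 28, so the blocks come out as 9, 10, 9 and 10 items.</p>
<pre lang="python">def block_start(i, p, n):
    """Index of the first element of block i when n items are split into p blocks."""
    return (i * n) // p  # integer floor division, like floor(($i * $n) / $p)

def block_end(i, p, n):
    """Index of the last element of block i (inclusive)."""
    return block_start(i + 1, p, n) - 1

def block_sizes(p, n):
    """Sizes of all p blocks; they differ by at most one and sum to n."""
    return [block_end(i, p, n) - block_start(i, p, n) + 1 for i in range(p)]

print(block_sizes(4, 38))  # prints [9, 10, 9, 10]</pre>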
]]></content:encoded>
			<wfw:commentRss>http://blog.reallywow.com/archives/65/feed</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Code4LibCon 2009: Timeline and IRC log</title>
		<link>http://blog.reallywow.com/archives/45</link>
		<comments>http://blog.reallywow.com/archives/45#comments</comments>
		<pubDate>Sat, 28 Feb 2009 14:46:55 +0000</pubDate>
		<dc:creator>lbjay</dc:creator>
				<category><![CDATA[code4lib]]></category>
		<category><![CDATA[hacks]]></category>
		<category><![CDATA[irc]]></category>
		<category><![CDATA[c4l09]]></category>
		<category><![CDATA[code4lib2009]]></category>
		<category><![CDATA[conference]]></category>
		<category><![CDATA[timeline]]></category>

		<guid isPermaLink="false">http://blog.reallywow.com/?p=45</guid>
		<description><![CDATA[To cut to the chase, I extracted the hCal events from the 2009 conference schedule and fed them into a Simile Timeline. I then linked each event to the corresponding slice of my IRC client log. If you want to take a look it&#8217;s here. I don&#8217;t remember what initially sent me there, but my [...]]]></description>
			<content:encoded><![CDATA[<abbr class="unapi-id" title="http://blog.reallywow.com/?p=45"><!-- &nbsp; --></abbr>
<p>To cut to the chase, I extracted the hCal events from the 2009 conference schedule and fed them into a <a href="http://code.google.com/p/simile-widgets/">Simile Timeline</a>. I then linked each event to the corresponding slice of my IRC client log. If you want to take a look it&#8217;s <a href="http://reallywow.com/c4l09/timeline">here</a>.</p>
<p><span id="more-45"></span>I don&#8217;t remember what initially sent me there, but my introduction to Code4Lib was through the IRC channel. I&#8217;ve been logged in there off and on ever since. It keeps me informed and entertained and, yes, occasionally distracted. I&#8217;ve since attended all four of the yearly conferences, met and meatspace-friended a good percentage of the #code4lib regulars, contributed a patch here and there to a couple of projects, and helped organize a <a href="http://wiki.code4lib.org/index.php/NEC4L">one-day regional gathering</a>. I guess you could say at this point that I&#8217;m pretty fond of the whole thing.</p>
<p>This is why I take it somewhat personally when the annual hand-wringing debate begins over the perceived &#8220;cliquishness&#8221; of the community. There was much fuss this year&#8211;an awkward amount, even, IMO&#8211;over 1st-timers vs. old-timers. I make it a point to try and sit with people I don&#8217;t know during the lunches and shake a few hands. Lots of folks do the same for dinner.  Basically, IMO, if you feel like an outcast n00b, YOU&#8217;RE NOT TRYING HARD ENOUGH.</p>
<p>There is, however&#8211;and maybe I have a bit of &#8220;I haz a straw man. Let me show u it&#8221; going on here, but anyway&#8211;an aspect of the <em>&#8220;code4lib is just a big fat secretive, juvenile high school-ish in crowd&#8221;</em> argument where I think we majorly fail, and that is the non-open nature of the backchannel.</p>
<p>I met a lot of awesome new people over the past few days attending the 4th code4lib conference in Providence, RI, including Jon Phipps of the NSDL MetaData Registry. I was a little surprised to read <a href="http://managemetadata.org/blog/2009/02/25/embrace-the-chaos/">he didn&#8217;t enjoy the program</a>, but that&#8217;s cool. I give him big props for calling it as he sees it. The part that got under my skin, because he&#8217;s totally right, was his mention of <em>&#8220;the hugely active IRC back channel of ongoing commentary (which really should be displayed where everyone including the presenters can read it)&#8221;</em>. This simply rang true to me.</p>
<p>Let me first say what I&#8217;m not saying: <strong>I do not think</strong> it&#8217;s rude and unfair that a bunch of us (100+, depending on whether you count those not physically present at the conf) are carrying on a parallel conversation while the presenters we have invited are getting up on stage and sharing projects and ideas that they care deeply about and have slaved over. This is the nature of our beast. To paraphrase something BillDeuber said in channel yesterday, is the channel an extension of the conf, or vice versa? I think the latter.</p>
<p>But <strong>I do think</strong> it&#8217;s rude and unfair that we are carrying on an <strong>un-open</strong> and <strong>inaccessible </strong>parallel conversation while the presenters <strong>we have invited</strong> are getting up on stage and sharing projects and ideas that they care deeply about and have slaved over.</p>
<p>The funny thing is, this reasoning is not what first prompted me to put my chat log up on the web. I did it because Corey Harper asked if I&#8217;d email him the section from when he was presenting at the linked data preconf and I figured others might like the same courtesy. I also thought Timeline would be a cool project to experiment with. I&#8217;ve since had a few conversations about whether or not it&#8217;s fair to the people in-channel who maybe didn&#8217;t realize what they were saying was going to be published later on. But if someone is accusing you of being cliquish and secretive, how is the proper response not to be more open and transparent? To say, &#8220;Here, take a look. See for yourself.&#8221;</p>
<p>This is probably making it seem like there must be some really juicy shit going on in the back channel, but that&#8217;s exactly the thing: there&#8217;s really not. It never gets any snarkier than your typical <a href="http://en.wikipedia.org/wiki/Mystery_Science_Theater_3000">MST3K</a> episode. And would anyone argue that Joel, Mike, Crow and Tom Servo didn&#8217;t really, deep down, love those old, bad movies they were forced to watch?</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.reallywow.com/archives/45/feed</wfw:commentRss>
		<slash:comments>11</slash:comments>
		</item>
		<item>
		<title>My &#8220;unified&#8221; twitter + identi.ca client</title>
		<link>http://blog.reallywow.com/archives/22</link>
		<comments>http://blog.reallywow.com/archives/22#comments</comments>
		<pubDate>Tue, 13 Jan 2009 18:21:15 +0000</pubDate>
		<dc:creator>lbjay</dc:creator>
				<category><![CDATA[identi.ca]]></category>
		<category><![CDATA[microblogging]]></category>
		<category><![CDATA[twitter]]></category>

		<guid isPermaLink="false">http://blog.reallywow.com/?p=22</guid>
		<description><![CDATA[I've been on the lookout for a client that would somehow unify my streams from both Twitter &#038; Identi.ca. There's several clients that will support one-or-the-other, but not both simultaneously. My solution doesn't technically do this either but it scratches my itch to at least have them in the same window.]]></description>
			<content:encoded><![CDATA[<abbr class="unapi-id" title="http://blog.reallywow.com/?p=22"><!-- &nbsp; --></abbr>
<p>I&#8217;ve been on the lookout for a client that would somehow unify my streams from both <a href="http://twitter.com/lbjay">Twitter</a> &amp; <a href="http://identi.ca/lbjay">Identi.ca</a>. There are a few clients that will support one-or-the-other (sometimes via <a href="http://hg.mozilla-hispano.org/uncryptic/identifox/overview/">hacking the source</a>) but I&#8217;ve found none that scratch my itch of having both services presented in the same window. My taskbar is precious real-estate.</p>
<p>These instructions reflect my setup on an Ubuntu vhost which I ssh into via PuTTY.</p>
<p>Required:</p>
<ul>
<li>Twitter account</li>
<li>Identi.ca account</li>
<li><a href="http://www.floodgap.com/software/ttytter">TTYtter</a>, a perl command-line client for accessing Twitter-compatible APIs</li>
<li>Gnu Screen</li>
</ul>
<p>Step 1, download TTYtter, make it executable and put it somewhere on your $PATH.</p>
<p>Step 2, create a file at ~/.ttytterrc1 with the following contents, including your identi.ca login:</p>
<pre><span style="color: #000080;">  url=http://identi.ca/api/statuses/friends_timeline.json
  rurl=http://identi.ca/api/statuses/replies.json
  uurl=http://identi.ca/api/statuses/user_timeline
  wurl=http://identi.ca/api/users/show
  update=http://identi.ca/api/statuses/update.json
  dmurl=http://identi.ca/api/direct_messages.json
  frurl=http://identi.ca/api/friendships/exists.json
  user=[username]:[pass]
</span></pre>
<p>Step 3, create another file at ~/.ttytterrc2 with just your twitter login:</p>
<pre>  <span style="color: #000080;">user=[username]:[pass]</span></pre>
<p>Step 4, create a dedicated screen session with the command <strong><span style="color: #000080;">screen -S ttytter</span></strong></p>
<p>Step 5, split the screen horizontally using the screen command <strong><span style="color: #000080;">Ctrl+a S</span></strong></p>
<p>Step 6, start your Identi.ca session in the top window with the command <span style="color: #000080;"><strong>ttytter.pl -rc=1</strong></span></p>
<p>Step 7, switch to the bottom pane with the screen command <span style="color: #000080;"><strong>Ctrl+a TAB</strong></span></p>
<p>Step 8, create a new window in the bottom pane with the screen command <strong><span style="color: #000080;">Ctrl+a c</span></strong></p>
<p>Step 9, start the Twitter session in the bottom pane with the command <span style="color: #000080;"><strong>ttytter.pl -rc=2</strong></span></p>
<div id="attachment_25" class="wp-caption alignnone" style="width: 310px"><a href="http://blog.reallywow.com/wp-content/uploads/2009/01/ttytter.png"><img class="size-medium wp-image-25" title="ttytter" src="http://blog.reallywow.com/wp-content/uploads/2009/01/ttytter-300x267.png" alt="Example screenshot" width="300" height="267" /></a><p class="wp-caption-text">Example screenshot</p></div>
<p>Only set this up 2 hours ago but already I can tell this is going to work for me long term. Beats having to do hard resets due to some combination of <a href="http://www.twhirl.org/">Twhirl</a> and/or my video drivers. One improvement I&#8217;d like to investigate is some kind of color highlighting of the &lt;username&gt;&#8217;s to improve readability.</p>
<p><em>Update: January 13, 2009 at 11:30</em></p>
<p>So I quickly realized that the above setup has a major malfunction, namely that it doesn&#8217;t allow preserving of the split window so it&#8217;s not possible to detach and reattach to the ttytter screen session. So there&#8217;s an extra trick necessary to do this:</p>
<p>Step 3.5, create an &#8220;outer&#8221; screen session that wraps the ttytter session with the command <span style="color: #000080;"><strong>screen -e^Ee -S outer</strong></span>.</p>
<p>The <strong>-e^Ee</strong> option binds the escape key for that session to Ctrl+e instead of the default Ctrl+a. This is a common trick for embedding screen sessions within sessions. With this extra step I can now detach from the outer session using <span style="color: #000080;"><strong>Ctrl+e d</strong></span> and reattach with the inner split-screen session preserved using <strong><span style="color: #000080;">screen -r outer</span></strong>.</p>
<p>Also, I learned that to colorize the output you can use the -ansi option to ttytter.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.reallywow.com/archives/22/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Semweb Gang talks about Glue</title>
		<link>http://blog.reallywow.com/archives/12</link>
		<comments>http://blog.reallywow.com/archives/12#comments</comments>
		<pubDate>Thu, 18 Dec 2008 22:09:16 +0000</pubDate>
		<dc:creator>lbjay</dc:creator>
				<category><![CDATA[Semantic Web]]></category>

		<guid isPermaLink="false">http://blog.reallywow.com/?p=12</guid>
		<description><![CDATA[Interesting conversation this month. (This is the stuff I listen to on my commute.) I was particularly intrigued by the 10 or so minutes spent discussing the need for a method of embedding identifiers and the location of a web service into HTML. Send the identifier to the service and get back the metadata. This [...]]]></description>
			<content:encoded><![CDATA[<abbr class="unapi-id" title="http://blog.reallywow.com/?p=12"><!-- &nbsp; --></abbr>
<p>Interesting conversation <a href="http://semanticgang.talis.com/2008/12/08/novemberdecember-2008-the-semantic-web-gang-discusses-glue-and-looks-back-at-2008/">this month</a>. (This is the stuff I listen to on my commute.) I was particularly intrigued by the 10 or so minutes spent discussing the need for a method of embedding identifiers and the location of a web service into HTML. Send the identifier to the service and get back the metadata. This is the exact use case of <a href="http://unapi.info">unAPI</a>.</p>
<p>I was all set to get to work and give them a <em>brrring!</em> on the cluephone, a.k.a., comment on the post. But before I got around to it <a href="http://inkdroid.org">Ed Summers</a> pointed out on irc that you can achieve the same thing using just &lt;link&gt; elements and/or <a href="http://tools.ietf.org/id/draft-nottingham-http-link-header-03.txt">HTTP Link: headers</a>. In other words, why separate the identifiers from the service URI?</p>
<p>I like how simple unAPI is to implement. Since your metadata service&#8217;s base url doesn&#8217;t change you don&#8217;t have to worry about coordinating attributes of elements that need to appear in both your &lt;head&gt; and page &lt;body&gt;. This is a non-issue for lots of folks, but I bet it&#8217;s not so simple if you&#8217;re using WordPress or Drupal for your CMS.</p>
<p>As for the <a href="http://getglue.com">Glue extension thingie</a>, I&#8217;ll try it out before passing judgement. But it did strike me funny that they&#8217;re not using RDF for anything. Also, and maybe I&#8217;m imagining it, but in the 10-minute wrapup at the end of the podcast I <em>think</em> Tom Heath basically takes some veiled jabs at the Glue guys for being SemWeb poseurs.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.reallywow.com/archives/12/feed</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
	</channel>
</rss>
