<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Reallywow &#187; ADS</title>
	<atom:link href="http://blog.reallywow.com/archives/category/ads/feed" rel="self" type="application/rss+xml" />
	<link>http://blog.reallywow.com</link>
	<description>Really? Wow... That's Reallywow</description>
	<lastBuildDate>Thu, 27 May 2010 15:02:55 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=abc</generator>
		<item>
		<title>Exploring Astronomy Dataset Links with GridWorks</title>
		<link>http://blog.reallywow.com/archives/135</link>
		<comments>http://blog.reallywow.com/archives/135#comments</comments>
		<pubDate>Thu, 27 May 2010 14:59:46 +0000</pubDate>
		<dc:creator>lbjay</dc:creator>
				<category><![CDATA[ADS]]></category>

		<guid isPermaLink="false">http://blog.reallywow.com/?p=135</guid>
		<description><![CDATA[At ADS we are looking at new ways to index and provide full text searching for the Astronomy and Physics literature we manage to obtain, either through scanning + OCR of historical content, or from digital material provided by some publishers. Two options we&#8217;re looking at are Apache Solr and CDS-Invenio. But that&#8217;s not what [...]]]></description>
			<content:encoded><![CDATA[<abbr class="unapi-id" title="http://blog.reallywow.com/?p=135"><!-- &nbsp; --></abbr>
<p>At <a href="http://adsabs.harvard.edu/index.html">ADS</a> we are looking at new ways to index and provide full text searching for the Astronomy and Physics literature we manage to obtain, either through scanning + OCR of historical content, or from digital material provided by some publishers. Two options we&#8217;re looking at are <a href="http://lucene.apache.org/solr/">Apache Solr</a> and <a href="http://cdsware.cern.ch/invenio/index.html">CDS-Invenio</a>. But that&#8217;s not what this post is about.</p>
<p>While parsing and indexing a pile of about 42k articles from the past dozen or so years of the <a href="http://iopscience.iop.org/0004-637X">ApJ</a>, <a href="http://iopscience.iop.org/1538-3881/">AJ</a>, <a href="http://iopscience.iop.org/2041-8205/">ApJL</a> and <a href="http://iopscience.iop.org/0067-0049/">ApJS</a>, formatted in the <a href="http://dtd.nlm.nih.gov/articleauthoring/">NLM XML schema</a>, I noticed that many of the articles contained external links to various things, most interestingly, astronomical datasets.* My first thought was, &#8220;hmm, I wonder what&#8217;s at the other end of all those links&#8230;,&#8221; followed closely by, &#8220;hey, crawling those links would make a nice dataset to load into that nifty new <a href="http://code.google.com/p/freebase-gridworks/">Freebase Gridworks</a> tool I heard about the other day.&#8221; So that&#8217;s what I did.</p>
<p>Out of 13652 articles there were 33600 total links which fell into three categories: http urls (28555), dataset links (938) and supplement links (4107). Dataset links consist of an identifier that looks something like <em>ADS/Sa.CXO#obs/927</em>. To get the goods you have to feed that id to a <a href="http://vo.ads.harvard.edu/dv/">resolver</a> which, assuming a <a href="http://vo.ads.harvard.edu/dv/DataVerifier.cgi">valid</a> identifier, will redirect you to the <a href="http://cda.harvard.edu/chaser/searchOcat.do?instrument=HRC-I,HRC-S,ACIS-I,ACIS-S&amp;grating=NONE,LETG,HETG&amp;status=observed,archived&amp;type=TOO,CAL,GO,GTO,DDT&amp;obsidRangeList=927&amp;radius=10&amp;resolver=simbad-ned&amp;inputCoordFrame=J2000&amp;inputCoordEquinox=2000&amp;outputCoordFrame=J2000&amp;outputCoordEquinox=2000&amp;outputCoordUnits=sexagesimal&amp;sortColumn=seqNum&amp;sortOrder=ascending">real location</a> of the dataset. Supplement links took a bit more head-scratching as their values consisted of just a relative file name, like <em>datafile3.txt</em> or <em>69491.figures.html</em>. We figured out that the solution was to append the filename to the publisher&#8217;s URL for the article, e.g., <a href="http://iopscience.iop.org/0004-637X/659/1/98/">article</a> and <a href="http://iopscience.iop.org/0004-637X/659/1/98/datafile2.txt">dataset</a> or <a href="http://iopscience.iop.org/0004-637X/661/2/845/">article</a> and <a href="http://iopscience.iop.org/0004-637X/661/2/845/70421.figures.html">figures</a>.</p>
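<p>The supplement-link fix described above can be sketched in a few lines of Python. This is only an illustration, not the actual crawl script: the function name <code>resolve_supplement</code> is made up for this example, and it assumes the publisher&#8217;s URL for the article is already known.</p>
<pre>
```python
from urllib.parse import urljoin


def resolve_supplement(article_url, filename):
    """Append a relative supplement filename to the article's URL."""
    # urljoin replaces the last path segment unless the base ends in "/",
    # so make sure the article URL is treated as a directory first.
    if not article_url.endswith("/"):
        article_url += "/"
    return urljoin(article_url, filename)


print(resolve_supplement("http://iopscience.iop.org/0004-637X/659/1/98/", "datafile2.txt"))
```
</pre>
<p>Normalizing the trailing slash matters: without it, <code>urljoin</code> would swap the final path segment for the filename instead of appending to it.</p>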
<p>The ultimate objective was to load the results of crawling these links into Gridworks, but that meant getting the data into CSV or TSV form. Rather than have the crawl script output straight to CSV, I stashed the results in a <a href="http://www.mongodb.org/">MongoDB</a> instance. Here&#8217;s an example of one of the resulting JSON documents in Mongo:</p>
<div class="codecolorer-container javascript default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:435px;height:300px;"><div class="javascript codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap"><span style="color: #009900;">&#123;</span>u<span style="color: #3366CC;">'_id'</span><span style="color: #339933;">:</span> ObjectId<span style="color: #009900;">&#40;</span><span style="color: #3366CC;">'4bfc3737a1f714263b000012'</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">,</span><br />
&nbsp;u<span style="color: #3366CC;">'anchor_text'</span><span style="color: #339933;">:</span> u<span style="color: #3366CC;">'http://astronomy.swin.edu.au/staff/dforbes/glob.html'</span><span style="color: #339933;">,</span><br />
&nbsp;u<span style="color: #3366CC;">'bibcode'</span><span style="color: #339933;">:</span> u<span style="color: #3366CC;">'2001ApJ...556L..83F'</span><span style="color: #339933;">,</span><br />
&nbsp;u<span style="color: #3366CC;">'content'</span><span style="color: #339933;">:</span> u<span style="color: #3366CC;">'&lt;HTML&gt;<span style="color: #000099; font-weight: bold;">\n</span>&lt;HEAD&gt;<span style="color: #000099; font-weight: bold;">\n</span>&lt;TITLE&gt;Duncan A. Forbes, Swinburne University, Globular Clusters&lt;/TITLE&gt;<span style="color: #000099; font-weight: bold;">\n</span>&lt;/HEAD&gt;<span style="color: #000099; font-weight: bold;">\n</span><span style="color: #000099; font-weight: bold;">\n</span>&lt;h1&gt; Globular Cluster Research&lt;/h1&gt;<span style="color: #000099; font-weight: bold;">\n</span><span style="color: #000099; font-weight: bold;">\n</span>I am interested in various aspects of Extragalactic Globular<span style="color: #000099; font-weight: bold;">\n</span> &nbsp; &nbsp;Cluster research. In particular the formation and evolution<span style="color: #000099; font-weight: bold;">\n</span> &nbsp; &nbsp;of Globular Cluster Systems and their host galaxies. 
<span style="color: #000099; font-weight: bold;">\n</span>&lt;br&gt;<span style="color: #000099; font-weight: bold;">\n</span><span style="color: #000099; font-weight: bold;">\n</span>&lt;UL&gt;<span style="color: #000099; font-weight: bold;">\n</span>&lt;A HREF=&quot;colours.html&quot;&gt;GLOBULAR CLUSTER PHOTOMETRY DATABASE&lt;/A&gt;<span style="color: #000099; font-weight: bold;">\n</span>&lt;/UL&gt;<span style="color: #000099; font-weight: bold;">\n</span><span style="color: #000099; font-weight: bold;">\n</span>&lt;UL&gt;<span style="color: #000099; font-weight: bold;">\n</span>&lt;A HREF=&quot;spectra.html&quot;&gt;GLOBULAR CLUSTER SPECTRAL DATABASE&lt;/A&gt;<span style="color: #000099; font-weight: bold;">\n</span>&lt;/UL&gt;<span style="color: #000099; font-weight: bold;">\n</span><span style="color: #000099; font-weight: bold;">\n</span><span style="color: #000099; font-weight: bold;">\n</span>&lt;UL&gt;<span style="color: #000099; font-weight: bold;">\n</span>&lt;A HREF=&quot;review.html&quot;&gt;GLOBULAR CLUSTER REVIEW PAPERS&lt;/A&gt;<span style="color: #000099; font-weight: bold;">\n</span>&lt;/UL&gt;<span style="color: #000099; font-weight: bold;">\n</span><span style="color: #000099; font-weight: bold;">\n</span><span style="color: #000099; font-weight: bold;">\n</span>&lt;UL&gt;<span style="color: #000099; font-weight: bold;">\n</span>&lt;A HREF=&quot;http://www.ucolick.org/~brodie/Sages/sages.html&quot;&gt; SAGES PROJECT&lt;/A&gt;<span style="color: #000099; font-weight: bold;">\n</span>&lt;/UL&gt;<span style="color: #000099; font-weight: bold;">\n</span><span style="color: #000099; font-weight: bold;">\n</span>&lt;UL&gt;<span style="color: #000099; font-weight: bold;">\n</span>&lt;A<span style="color: #000099; font-weight: bold;">\n</span><span style="color: #000099; font-weight: bold;">\t</span> &nbsp;HREF=&quot;http://www.physics.mcmaster.ca/resources/fs3_resources.html&quot;&gt; HARRIS DATABASE&lt;/A&gt;<span style="color: #000099; font-weight: 
bold;">\n</span>&lt;/UL&gt;<span style="color: #000099; font-weight: bold;">\n</span><span style="color: #000099; font-weight: bold;">\n</span><span style="color: #000099; font-weight: bold;">\n</span><span style="color: #000099; font-weight: bold;">\n</span>&lt;tr&gt;&lt;td&gt;&lt;hr noshade&gt;&lt;/td&gt;&lt;/tr&gt;<span style="color: #000099; font-weight: bold;">\n</span><span style="color: #000099; font-weight: bold;">\n</span> &lt;/BODY&gt;<span style="color: #000099; font-weight: bold;">\n</span>'</span><span style="color: #339933;">,</span><br />
&nbsp;u<span style="color: #3366CC;">'context'</span><span style="color: #339933;">:</span> u<span style="color: #3366CC;">'&lt;p&gt;The combined sample data are available at &lt;ext-link ext-link-type=&quot;uri&quot; xlink:href=&quot;http://astronomy.swin.edu.au/staff/dforbes/glob.html&quot;&gt;http://astronomy.swin.edu.au/staff/dforbes/glob.html&lt;/ext-link&gt;. &lt;/p&gt;<span style="color: #000099; font-weight: bold;">\n</span>'</span><span style="color: #339933;">,</span><br />
&nbsp;u<span style="color: #3366CC;">'doi'</span><span style="color: #339933;">:</span> u<span style="color: #3366CC;">'10.1086/323006'</span><span style="color: #339933;">,</span><br />
&nbsp;u<span style="color: #3366CC;">'ft_source'</span><span style="color: #339933;">:</span> u<span style="color: #3366CC;">'/proj/ads/articles/sources/AAS/ApJL/2001/556/2/323006/323006.xml'</span><span style="color: #339933;">,</span><br />
&nbsp;u<span style="color: #3366CC;">'link_id'</span><span style="color: #339933;">:</span> u<span style="color: #3366CC;">'http://astronomy.swin.edu.au/staff/dforbes/glob.html'</span><span style="color: #339933;">,</span><br />
&nbsp;u<span style="color: #3366CC;">'link_type'</span><span style="color: #339933;">:</span> u<span style="color: #3366CC;">'UrlLink'</span><span style="color: #339933;">,</span><br />
&nbsp;u<span style="color: #3366CC;">'response'</span><span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>u<span style="color: #3366CC;">'accept-ranges'</span><span style="color: #339933;">:</span> u<span style="color: #3366CC;">'bytes'</span><span style="color: #339933;">,</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;u<span style="color: #3366CC;">'content-length'</span><span style="color: #339933;">:</span> u<span style="color: #3366CC;">'781'</span><span style="color: #339933;">,</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;u<span style="color: #3366CC;">'content-location'</span><span style="color: #339933;">:</span> u<span style="color: #3366CC;">'http://astronomy.swin.edu.au/~dforbes/glob.html'</span><span style="color: #339933;">,</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;u<span style="color: #3366CC;">'content-type'</span><span style="color: #339933;">:</span> u<span style="color: #3366CC;">'text/html; charset=UTF-8'</span><span style="color: #339933;">,</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;u<span style="color: #3366CC;">'date'</span><span style="color: #339933;">:</span> u<span style="color: #3366CC;">'Tue, 25 May 2010 10:14:07 GMT'</span><span style="color: #339933;">,</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;u<span style="color: #3366CC;">'server'</span><span style="color: #339933;">:</span> u<span style="color: #3366CC;">'Apache/2.2.15 (Unix) DAV/2 mod_ssl/2.2.15 OpenSSL/0.9.8e-fips-rhel5'</span><span style="color: #339933;">,</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;u<span style="color: #3366CC;">'status'</span><span style="color: #339933;">:</span> u<span style="color: #3366CC;">'200'</span><span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span><br />
&nbsp;u<span style="color: #3366CC;">'solr_id'</span><span style="color: #339933;">:</span> u<span style="color: #3366CC;">'31908'</span><span style="color: #339933;">,</span><br />
&nbsp;u<span style="color: #3366CC;">'url'</span><span style="color: #339933;">:</span> u<span style="color: #3366CC;">'http://astronomy.swin.edu.au/staff/dforbes/glob.html'</span><span style="color: #339933;">,</span><br />
&nbsp;u<span style="color: #3366CC;">'xpath'</span><span style="color: #339933;">:</span> u<span style="color: #3366CC;">'/html/article/body/sec[5]/fn-group/fn/p/ext-link'</span><span style="color: #009900;">&#125;</span></div></div>
<p>From there it was easy to dump what I needed to csv and load into Gridworks. I&#8217;m not going to get into how totally awesome the Gridworks software is, except to say you should watch the <a href="http://vimeo.com/groups/gridworks">demo videos</a>.</p>
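<p>For the curious, the dump step might look roughly like the sketch below. It is not the script I actually ran: the column selection and the helper names <code>to_row</code> and <code>dump_csv</code> are hypothetical, and a plain list of dicts stands in for the documents that would really come out of the MongoDB collection.</p>
<pre>
```python
import csv
import io

# Hypothetical column selection; field names follow the example document above.
COLUMNS = ["bibcode", "doi", "link_type", "url", "status", "content_type"]


def to_row(doc):
    """Flatten one crawl document into a flat dict suitable for CSV."""
    # A failed crawl may have no response at all, so fall back to an empty dict.
    response = doc.get("response") or {}
    return {
        "bibcode": doc.get("bibcode", ""),
        "doi": doc.get("doi", ""),
        "link_type": doc.get("link_type", ""),
        "url": doc.get("url", ""),
        "status": response.get("status", ""),
        "content_type": response.get("content-type", ""),
    }


def dump_csv(docs):
    """Return CSV text with a header row and one row per document."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=COLUMNS)
    writer.writeheader()
    for doc in docs:
        writer.writerow(to_row(doc))
    return out.getvalue()


example = {
    "bibcode": "2001ApJ...556L..83F",
    "doi": "10.1086/323006",
    "link_type": "UrlLink",
    "url": "http://astronomy.swin.edu.au/staff/dforbes/glob.html",
    "response": {"status": "200", "content-type": "text/html; charset=UTF-8"},
}
print(dump_csv([example]))
```
</pre>
<p>The large <code>content</code> and <code>context</code> fields are left out of the row on purpose in this sketch; they make for unwieldy spreadsheet cells.</p>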
<p>I can&#8217;t post the entire Gridworks project, but here are some screencaps, a column list and some of the more interesting facets.</p>
<p style="text-align: center;">
<div id="attachment_147" class="wp-caption aligncenter" style="width: 310px"><a href="http://blog.reallywow.com/wp-content/uploads/2010/05/gridworks.png"><img class="size-medium wp-image-147 " title="gridworks" src="http://blog.reallywow.com/wp-content/uploads/2010/05/gridworks-300x191.png" alt="" width="300" height="191" /></a><p class="wp-caption-text">Initial data load plus some derived columns</p></div>
<p>Column list:</p>
<ul>
<li>Id of the MongoDB doc</li>
<li>Id of the solr doc</li>
<li>ADS bibcode identifier of the article</li>
<li>Publication year &#8211; derived from the bibcode</li>
<li>DOI</li>
<li>xpath expression of the &lt;ext-link&gt; element</li>
<li>parent tag &#8211; the containing element type</li>
<li>link context &#8211; the containing element&#8217;s serialized xml contents</li>
<li>link type &#8211; one of url, dataset or supplement</li>
<li>anchor text &#8211; the text contents of the &lt;ext-link&gt;</li>
<li>full text source file</li>
<li>journal</li>
<li>full text source &#8211; publisher</li>
<li>extlink id &#8211; either the url or the dataset id or the supplement filename</li>
<li>domain &#8211; derived from the url</li>
<li>status &#8211; http status returned when requesting the resource</li>
<li>content-type &#8211; content-type header returned in the response</li>
<li>mimetype &#8211; derived from the content-type response header</li>
<li>location &#8211; the final url of the resource following any redirects</li>
<li>content length</li>
<li>response headers &#8211; list of all the header attribute names returned in the response (just to see what other interesting stuff might be there)</li>
</ul>
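<p>Three of the columns above are derived from other values. The same derivations can be written in Python, as a rough sketch; the function names are made up for illustration and this is not how the columns were actually computed for the project.</p>
<pre>
```python
from urllib.parse import urlparse


def pub_year(bibcode):
    """The first four characters of an ADS bibcode are the publication year."""
    return int(bibcode[:4])


def domain(url):
    """Host part of the URL, lowercased; empty string when there is none."""
    return urlparse(url).hostname or ""


def mimetype(content_type):
    """Drop parameters such as charset from a Content-Type header value."""
    return content_type.split(";", 1)[0].strip().lower()


print(pub_year("2001ApJ...556L..83F"))
print(domain("http://astronomy.swin.edu.au/staff/dforbes/glob.html"))
print(mimetype("text/html; charset=UTF-8"))
```
</pre>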
<div id="attachment_148" class="wp-caption aligncenter" style="width: 287px"><a href="http://blog.reallywow.com/wp-content/uploads/2010/05/gridworks_linktype.png"><img class="size-full wp-image-148 " title="gridworks_linktype" src="http://blog.reallywow.com/wp-content/uploads/2010/05/gridworks_linktype.png" alt="" width="277" height="143" /></a><p class="wp-caption-text">Still to be determined how many of the url links point to some kind of data</p></div>
<div id="attachment_149" class="wp-caption aligncenter" style="width: 284px"><a href="http://blog.reallywow.com/wp-content/uploads/2010/05/gridworks_parenttag.png"><img class="size-full wp-image-149" title="gridworks_parenttag" src="http://blog.reallywow.com/wp-content/uploads/2010/05/gridworks_parenttag.png" alt="" width="274" height="260" /></a><p class="wp-caption-text">Knowing the container could help parse out something about the semantics of the link</p></div>
<p style="text-align: center;">
<div id="attachment_150" class="wp-caption aligncenter" style="width: 284px"><a href="http://blog.reallywow.com/wp-content/uploads/2010/05/gridworks_status.png"><img class="size-full wp-image-150" title="gridworks_status" src="http://blog.reallywow.com/wp-content/uploads/2010/05/gridworks_status.png" alt="" width="274" height="370" /></a><p class="wp-caption-text">~70% 200&#39;s was more than I expected. Of course 200 doesn&#39;t mean it actually found something interesting.</p></div>
<div id="attachment_151" class="wp-caption aligncenter" style="width: 283px"><a href="http://blog.reallywow.com/wp-content/uploads/2010/05/gridworks_contenttype.png"><img class="size-full wp-image-151" title="gridworks_contenttype" src="http://blog.reallywow.com/wp-content/uploads/2010/05/gridworks_contenttype.png" alt="" width="273" height="473" /></a><p class="wp-caption-text">I would have hoped for fewer text/html responses</p></div>
<div id="attachment_152" class="wp-caption aligncenter" style="width: 285px"><a href="http://blog.reallywow.com/wp-content/uploads/2010/05/gridworks_domain.png"><img class="size-full wp-image-152  " title="gridworks_domain" src="http://blog.reallywow.com/wp-content/uploads/2010/05/gridworks_domain.png" alt="" width="275" height="321" /></a><p class="wp-caption-text">All the gcn.gsfc.nasa.gov hits look like observation reports, like this one, which I think is a good thing</p></div>
<p style="text-align: left;">Finally, thanks to <a href="http://dysinterested.com/">Sean Hannan</a>, who worked out <a href="http://gist.github.com/414927">a hack</a> to a bit of the Gridworks JavaScript that automatically turns any cell values beginning with &#8220;http://&#8221; or &#8220;https://&#8221; into active links. The nice thing about that was it let me turn the column containing the MongoDB id into a link to a little <a href="http://webpy.org">web.py</a> script that dumps a JSON representation of the document.</p>
<p>* NLM allows for links to external resources using either <a href="http://dtd.nlm.nih.gov/articleauthoring/tag-library/2.3/n-ju50.html">&lt;ext-link&gt;</a> or <a href="http://dtd.nlm.nih.gov/articleauthoring/tag-library/2.3/n-2hw0.html">&lt;supplementary-material&gt;</a> elements.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.reallywow.com/archives/135/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Embedding citation metadata in the ADS HTML</title>
		<link>http://blog.reallywow.com/archives/123</link>
		<comments>http://blog.reallywow.com/archives/123#comments</comments>
		<pubDate>Mon, 01 Mar 2010 16:08:18 +0000</pubDate>
		<dc:creator>lbjay</dc:creator>
				<category><![CDATA[ADS]]></category>

		<guid isPermaLink="false">http://blog.reallywow.com/?p=123</guid>
		<description><![CDATA[Here&#8217;s what I know: you can embed a set of &#60;meta/&#62; tags containing citation metadata in your HTML to help Google Scholar to index your content. We&#8217;ve been doing it at ADS for quite a while. I&#8217;m not certain if the impetus came directly from Google, or, more likely, we got the idea from a [...]]]></description>
			<content:encoded><![CDATA[<abbr class="unapi-id" title="http://blog.reallywow.com/?p=123"><!-- &nbsp; --></abbr>
<p>Here&#8217;s what I know: you can embed a set of &lt;meta/&gt; tags containing citation metadata in your HTML to help Google Scholar to index your content. We&#8217;ve been doing it at <a href="http://ads.harvard.edu">ADS</a> for quite a while. I&#8217;m not certain if the impetus came directly from Google, or, more likely, we got the idea from a <a href="http://www.crossref.org/CrossTech/2008/05/natures_metadata_for_web_pages_1.html">CrossTech blog post</a> by Tony Hammond that describes the technique.</p>
<p>For example, if you execute <code class="codecolorer bash default"><span class="bash">&nbsp;curl <span style="color: #660033;">-s</span> http:<span style="color: #000000; font-weight: bold;">//</span>adsabs.harvard.edu<span style="color: #000000; font-weight: bold;">/</span>abs<span style="color: #000000; font-weight: bold;">/</span>1977NuPhB.126..298A <span style="color: #000000; font-weight: bold;">|</span> <span style="color: #c20cb9; font-weight: bold;">grep</span> meta</span></code> you should see:</p>
<div class="codecolorer-container html4strict default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:435px;height:300px;"><div class="html4strict codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">...<br />
<span style="color: #009900;">&lt;<span style="color: #000000; font-weight: bold;">meta</span> <span style="color: #000066;">name</span><span style="color: #66cc66;">=</span><span style="color: #ff0000;">&quot;citation_language&quot;</span> <span style="color: #000066;">content</span><span style="color: #66cc66;">=</span><span style="color: #ff0000;">&quot;en&quot;</span> <span style="color: #66cc66;">/</span>&gt;</span><br />
<span style="color: #009900;">&lt;<span style="color: #000000; font-weight: bold;">meta</span> <span style="color: #000066;">name</span><span style="color: #66cc66;">=</span><span style="color: #ff0000;">&quot;citation_doi&quot;</span> <span style="color: #000066;">content</span><span style="color: #66cc66;">=</span><span style="color: #ff0000;">&quot;10.1016/0550-3213(77)90384-4&quot;</span> <span style="color: #66cc66;">/</span>&gt;</span><br />
<span style="color: #009900;">&lt;<span style="color: #000000; font-weight: bold;">meta</span> <span style="color: #000066;">name</span><span style="color: #66cc66;">=</span><span style="color: #ff0000;">&quot;citation_abstract_html_url&quot;</span> <span style="color: #000066;">content</span><span style="color: #66cc66;">=</span><span style="color: #ff0000;">&quot;http://adsabs.harvard.edu/abs/1977NuPhB.126..298A&quot;</span> <span style="color: #66cc66;">/</span>&gt;</span><br />
<span style="color: #009900;">&lt;<span style="color: #000000; font-weight: bold;">meta</span> <span style="color: #000066;">name</span><span style="color: #66cc66;">=</span><span style="color: #ff0000;">&quot;citation_title&quot;</span> <span style="color: #000066;">content</span><span style="color: #66cc66;">=</span><span style="color: #ff0000;">&quot;Asymptotic freedom in parton language&quot;</span> <span style="color: #66cc66;">/</span>&gt;</span><br />
<span style="color: #009900;">&lt;<span style="color: #000000; font-weight: bold;">meta</span> <span style="color: #000066;">name</span><span style="color: #66cc66;">=</span><span style="color: #ff0000;">&quot;citation_authors&quot;</span> <span style="color: #000066;">content</span><span style="color: #66cc66;">=</span><span style="color: #ff0000;">&quot;Altarelli, G.; Parisi, G.&quot;</span> <span style="color: #66cc66;">/</span>&gt;</span><br />
<span style="color: #009900;">&lt;<span style="color: #000000; font-weight: bold;">meta</span> <span style="color: #000066;">name</span><span style="color: #66cc66;">=</span><span style="color: #ff0000;">&quot;citation_issn&quot;</span> <span style="color: #000066;">content</span><span style="color: #66cc66;">=</span><span style="color: #ff0000;">&quot;0550-3213&quot;</span> <span style="color: #66cc66;">/</span>&gt;</span><br />
<span style="color: #009900;">&lt;<span style="color: #000000; font-weight: bold;">meta</span> <span style="color: #000066;">name</span><span style="color: #66cc66;">=</span><span style="color: #ff0000;">&quot;citation_date&quot;</span> <span style="color: #000066;">content</span><span style="color: #66cc66;">=</span><span style="color: #ff0000;">&quot;08/1977&quot;</span> <span style="color: #66cc66;">/</span>&gt;</span><br />
<span style="color: #009900;">&lt;<span style="color: #000000; font-weight: bold;">meta</span> <span style="color: #000066;">name</span><span style="color: #66cc66;">=</span><span style="color: #ff0000;">&quot;citation_journal_title&quot;</span> <span style="color: #000066;">content</span><span style="color: #66cc66;">=</span><span style="color: #ff0000;">&quot;Nuclear Physics B&quot;</span> <span style="color: #66cc66;">/</span>&gt;</span><br />
<span style="color: #009900;">&lt;<span style="color: #000000; font-weight: bold;">meta</span> <span style="color: #000066;">name</span><span style="color: #66cc66;">=</span><span style="color: #ff0000;">&quot;citation_volume&quot;</span> <span style="color: #000066;">content</span><span style="color: #66cc66;">=</span><span style="color: #ff0000;">&quot;126&quot;</span> <span style="color: #66cc66;">/</span>&gt;</span><br />
<span style="color: #009900;">&lt;<span style="color: #000000; font-weight: bold;">meta</span> <span style="color: #000066;">name</span><span style="color: #66cc66;">=</span><span style="color: #ff0000;">&quot;citation_firstpage&quot;</span> <span style="color: #000066;">content</span><span style="color: #66cc66;">=</span><span style="color: #ff0000;">&quot;298&quot;</span> <span style="color: #66cc66;">/</span>&gt;</span><br />
<span style="color: #009900;">&lt;<span style="color: #000000; font-weight: bold;">meta</span> <span style="color: #000066;">name</span><span style="color: #66cc66;">=</span><span style="color: #ff0000;">&quot;citation_lastpage&quot;</span> <span style="color: #000066;">content</span><span style="color: #66cc66;">=</span><span style="color: #ff0000;">&quot;318&quot;</span> <span style="color: #66cc66;">/</span>&gt;</span><br />
...</div></div>
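<p>Generating tags like these from a bibliographic record is straightforward. Here is a minimal sketch using Python&#8217;s standard ElementTree module; it is not the code ADS runs, and both the function name <code>citation_meta_tags</code> and the shape of the <code>record</code> dict are assumptions made for the example.</p>
<pre>
```python
import xml.etree.ElementTree as ET


def citation_meta_tags(record):
    """Serialize each (field, value) pair as a citation_* meta element."""
    tags = []
    for field, value in record.items():
        element = ET.Element("meta", {"name": "citation_" + field, "content": value})
        # ElementTree escapes quotes and ampersands in attribute values.
        tags.append(ET.tostring(element, encoding="unicode"))
    return tags


record = {
    "title": "Asymptotic freedom in parton language",
    "authors": "Altarelli, G.; Parisi, G.",
    "journal_title": "Nuclear Physics B",
    "volume": "126",
    "firstpage": "298",
    "lastpage": "318",
}
for tag in citation_meta_tags(record):
    print(tag)
```
</pre>
<p>Building the elements with a serializer rather than string formatting means titles containing quotes or ampersands come out properly escaped.</p>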
<p>Since the first implementation we&#8217;ve had some back-and-forth with Abhishek Jain at Google Scholar to ensure we&#8217;re making use of the full set of fields that Google Scholar looks for.*</p>
<p><a href="http://onebiglibrary.net/">Dan Chudnov</a>, David Bucknum &amp; <a href="http://inkdroid.org/journal/">Ed Summers</a> at the LoC recently expressed interest in also embedding these tags. In the absence of an official reference from the Google Scholar folks, I figured it would be a good thing to post the list of fields here.</p>
<ul>
<li>citation_language</li>
<li>citation_doi</li>
<li>citation_abstract_html_url</li>
<li>citation_title</li>
<li>citation_authors</li>
<li>citation_issn</li>
<li>citation_date</li>
<li>citation_journal_title</li>
<li>citation_volume</li>
<li>citation_firstpage</li>
<li>citation_lastpage</li>
<li>citation_publisher</li>
<li>citation_issue</li>
<li>citation_pdf_url</li>
<li>citation_pmid</li>
<li>citation_keywords (multiple instances OK)</li>
<li>citation_conference</li>
<li>citation_dissertation_name</li>
<li>citation_dissertation_institution</li>
<li>citation_patent_number</li>
<li>citation_patent_country</li>
<li>citation_technical_report_number</li>
<li>citation_technical_report_institution</li>
</ul>
<p>I had to cull this list via a visual scan of a long, forwarded e-mail thread. So, like I tried to insinuate above, it sure would be great if Google Scholar would publish an official reference to this schema somewhere.</p>
<p>* all instances of the term &#8220;we&#8221; should really be read as &#8220;my boss, Alberto&#8221;.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.reallywow.com/archives/123/feed</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Contextual Inquiry on the Cheap</title>
		<link>http://blog.reallywow.com/archives/112</link>
		<comments>http://blog.reallywow.com/archives/112#comments</comments>
		<pubDate>Thu, 31 Dec 2009 15:09:47 +0000</pubDate>
		<dc:creator>lbjay</dc:creator>
				<category><![CDATA[ADS]]></category>
		<category><![CDATA[UX]]></category>

		<guid isPermaLink="false">http://blog.reallywow.com/?p=112</guid>
		<description><![CDATA[I thought I&#8217;d share the interview outline I&#8217;ve been using to conduct some low effort contextual inquiry sessions with ADS users. Classic contextual inquiry, in which the researcher sits with or shadows a person in the context of the subject&#8217;s own working environment, is often conducted in 3+ hour sessions, frequently with all manner of [...]]]></description>
			<content:encoded><![CDATA[<abbr class="unapi-id" title="http://blog.reallywow.com/?p=112"><!-- &nbsp; --></abbr>
<p>I thought I&#8217;d share the <a href="http://docs.google.com/View?id=df2kgdvp_272d9mbxrfg">interview outline</a> I&#8217;ve been using to conduct some low effort contextual inquiry sessions with <a href="http://ads.harvard.edu">ADS</a> users.</p>
<div id="attachment_116" class="wp-caption alignright" style="width: 160px"><a href="http://docs.google.com/View?id=df2kgdvp_272d9mbxrfg"><img class="size-thumbnail wp-image-116 " title="interview outline" src="http://blog.reallywow.com/wp-content/uploads/2009/12/interview1-150x150.png" alt="" width="150" height="150" /></a><p class="wp-caption-text">thumbnail links to google doc</p></div>
<p>Classic <a title="Contextual inquiry - Wikipedia, the free encyclopedia" href="http://en.wikipedia.org/wiki/Contextual_inquiry">contextual inquiry</a>, in which the researcher sits with or shadows a person in the context of the subject&#8217;s own working environment, is often conducted in 3+ hour sessions, frequently with all manner of video capturing equipment. My goal is to cut that time down to 30 minutes, partly because this whole user research thing is supposed to be a part-time endeavor, and also because the majority of ADS users are PhDs, and we all know just <a href="http://www.nytimes.com/2009/09/22/technology/internet/22netflix.html?_r=2&amp;ref=technology&amp;pagewanted=all">how valuable their time is</a>.</p>
<p>So far I&#8217;ve only managed to conduct four of these interviews (with two more scheduled). I would love to get a total of 10. Since I don&#8217;t have access to video equipment, I simply mash out typewritten, poorly spelled notes as fast as I can. The notes have a stream-of-consciousness flavor, but the early indications are that the information gathered will be valuable.</p>
<p>Example notes:</p>
<pre style="padding-left: 30px;">refers to bibcode as "indexing thing". "not any use to me."
wrote a perl script that rewrites the bibcode into something understandabl
other strategies for searching for particular star: entering star name into abstract search or title search.
finds one article using abstract search.
mentions that he doesn't know boolean sytnax by memory
to find more tries going to simbad and finds alternate names for the star</pre>
]]></content:encoded>
			<wfw:commentRss>http://blog.reallywow.com/archives/112/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
