<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	>
<channel>
	<title>Comments on: The Ethics of Web Crawling</title>
	<atom:link href="http://anonymousprof.com/the-ethics-of-web-crawling/feed/" rel="self" type="application/rss+xml" />
	<link>http://anonymousprof.com/the-ethics-of-web-crawling/</link>
	<description>Ramblings and Ravings of an Academic</description>
	<pubDate>Wed, 19 Nov 2008 04:10:42 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.5.1</generator>
		<item>
		<title>By: Anonymous Prof &#187; Blog Archive &#187; The Legality of Web Crawling</title>
		<link>http://anonymousprof.com/the-ethics-of-web-crawling/#comment-179</link>
		<dc:creator>Anonymous Prof &#187; Blog Archive &#187; The Legality of Web Crawling</dc:creator>
		<pubDate>Tue, 25 Mar 2008 14:40:42 +0000</pubDate>
		<guid isPermaLink="false">http://anonymousprof.com/the-ethics-of-web-crawling/#comment-179</guid>
		<description>[...] got a great response from Aaron to my “Ethics of Web Crawling” post from yesterday and decided to find out where I stood legally on this issue. Luckily, my [...]</description>
		<content:encoded><![CDATA[<p>[...] got a great response from Aaron to my “Ethics of Web Crawling” post from yesterday and decided to find out where I stood legally on this issue. Luckily, my [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Anonymous Prof</title>
		<link>http://anonymousprof.com/the-ethics-of-web-crawling/#comment-178</link>
		<dc:creator>Anonymous Prof</dc:creator>
		<pubDate>Tue, 25 Mar 2008 01:37:21 +0000</pubDate>
		<guid isPermaLink="false">http://anonymousprof.com/the-ethics-of-web-crawling/#comment-178</guid>
		<description>Hey Aaron, 

That's a great point. I checked for a robots.txt file and there was none. As for their TOS, I'm not sure how to interpret the following:

Under a section titled: "Proprietary Rights; Confidentiality"
"The Website contains the copyrighted material, trademarks and other proprietary information of Kiva and its licensors. Except for that information that is in the public domain or for which you have been given express written permission, you may not copy, modify, publish, transmit, distribute, perform, display, or sell any such proprietary information."

I'm not sure if that covers web crawling or not. Also, because my goal for the data is academic and not for profit, would my crawling fall under fair use? Perhaps I should consult a lawyer (luckily, my brother is an expert in this area...looks like he'll be getting a call tomorrow).

Thanks!
-AP</description>
		<content:encoded><![CDATA[<p>Hey Aaron, </p>
<p>That&#8217;s a great point. I checked for a robots.txt file and there was none. As for their TOS, I&#8217;m not sure how to interpret the following:</p>
<p>Under a section titled: &#8220;Proprietary Rights; Confidentiality&#8221;<br />
&#8220;The Website contains the copyrighted material, trademarks and other proprietary information of Kiva and its licensors. Except for that information that is in the public domain or for which you have been given express written permission, you may not copy, modify, publish, transmit, distribute, perform, display, or sell any such proprietary information.&#8221;</p>
<p>I&#8217;m not sure if that covers web crawling or not. Also, because my goal for the data is academic and not for profit, would my crawling fall under fair use? Perhaps I should consult a lawyer (luckily, my brother is an expert in this area&#8230;looks like he&#8217;ll be getting a call tomorrow).</p>
<p>Thanks!<br />
-AP</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Aaron Schiff</title>
		<link>http://anonymousprof.com/the-ethics-of-web-crawling/#comment-177</link>
		<dc:creator>Aaron Schiff</dc:creator>
		<pubDate>Mon, 24 Mar 2008 23:23:37 +0000</pubDate>
		<guid isPermaLink="false">http://anonymousprof.com/the-ethics-of-web-crawling/#comment-177</guid>
		<description>Many commercial websites have terms of use that prohibit web crawling to collect data. For example, airline sites often will not let you collect their price data in bulk. You should check if the site you crawled has such a policy. Also check the server's robots.txt file to see if they've prevented indexing by search engines. If neither of these applies, I'd say it's safe to assume that the site does not mind its data being crawled (it will have been crawled by Google etc. already anyway).</description>
		<content:encoded><![CDATA[<p>Many commercial websites have terms of use that prohibit web crawling to collect data. For example, airline sites often will not let you collect their price data in bulk. You should check if the site you crawled has such a policy. Also check the server&#8217;s robots.txt file to see if they&#8217;ve prevented indexing by search engines. If neither of these applies, I&#8217;d say it&#8217;s safe to assume that the site does not mind its data being crawled (it will have been crawled by Google etc. already anyway).</p>
]]></content:encoded>
	</item>
</channel>
</rss>
