{"id":330,"date":"2005-08-30T19:13:07","date_gmt":"2005-08-30T17:13:07","guid":{"rendered":"http:\/\/frenchfragfactory.net\/ozh\/?p=330"},"modified":"2008-04-25T21:24:38","modified_gmt":"2008-04-25T19:24:38","slug":"rss-thoughts-and-facts","status":"publish","type":"post","link":"https:\/\/planetozh.com\/blog\/2005\/08\/rss-thoughts-and-facts\/","title":{"rendered":"RSS Thoughts and Facts"},"content":{"rendered":"<p>One thing I like about FeedBurner is the little badge they can provide telling how many subscribers are getting their hands on your RSS feeds. Since I&#39;m not planning to use this service, I&#39;ve been thinking about coding something (for WordPress, of course) that could produce the same kind of statistics. Indeed, how many unknown and silent readers, who don&#39;t even come to your site, can you reach via RSS? That is an exciting question.<\/p>\n<p>So, I went through some logging and digging in feed requests, and came up with a few stats, facts and questions&#8230;<\/p>\n<p><!--more--><\/p>\n<ul>\n<li>Some services like Technorati are fetching my 3 feeds in a row : RSS 0.91, RSS 2.0 and Atom. Isn&#39;t this a waste of their resources ? I&#39;d have thought they could fetch just one, say, the most verbose format. I really don&#39;t see the point of parsing less loquacious files.\n<\/li>\n<li><a href=\"http:\/\/www.pluck.com\/\" rel=\"tag\">Pluck<\/a>&#39;s bot (aka PluckFeedCrawler\/2.0) is denied access to sites running <a href=\"http:\/\/www.ioerror.us\/software\/bad-behavior\/\" rel=\"tag\">Bad Behavior<\/a>, because it sends no &#39;Accept&#39; HTTP header. I hope this will be fixed soon, since it looks like a rather nice service, almost as usable as Bloglines.\n<\/li>\n<li>People running a bot on <em>fc-e1.feedcache.net<\/em> are real dumbasses. Their bot sends absolutely no headers, not even a User Agent, and fetches my feed exactly 4 times in a row in 4 seconds, every fikin&#39; hour. 
By the way, why isn&#39;t it blocked by Bad Behavior when sending no HTTP headers ?\n<\/li>\n<li>Someone running intraVnews on <em>florent.netbios.fr<\/em> is polling my RSS feed every 10 minutes. What a misuse of bandwidth considering the frequency of my posts :)\n<\/li>\n<li>Obviously the RSS market is still growing, judging by the number of bots and services that crawl feeds when associated websites are not open yet : <em>topicblogs.com<\/em>, <em>blogslive.com<\/em>, <em>zilbo.com<\/em>, <em>socialmarks.com<\/em>, <em>feedcache.net<\/em>.\n<\/li>\n<li><em>Socialmarks\/0.2 Beta<\/em> comes from <a href=\"http:\/\/www.socialmarks.com\/\">Socialmarks.com<\/a>, a website whose title is &quot;Social <em>Boomkaring<\/em>&quot;. Wow, sounds cool. Let&#39;s <em>boomkar<\/em>, then.\n<\/li>\n<li>I don&#39;t understand why <a href=\"http:\/\/rsscache.com\">RSScache<\/a> is regularly fetching my feed ? I&#39;m not using their service.\n<\/li>\n<li>Some clients are impolite jerks not clearly telling who they are and where they come from. How am I supposed to find out about &quot;<em>Ken<\/em>&quot;, &quot;<em>GM Panel<\/em>&quot; or, I liked this one, &quot;<em>SET USER AGENT<\/em>&quot; ?\n<\/li>\n<li>I&#39;ve found out about a few nice looking or promising clients, <del datetime=\"2005-08-30T19:41:35+00:00\">and amongst them <a href=\"http:\/\/www.lektora.com\/\">Lektora<\/a> which I&#39;ll try when I have some time.<\/del> (tried. It sucks)\n<\/li>\n<li>I&#39;ve found out about some websites I&#39;m not going to trust. Although <em>blogstreet.com<\/em> and <em>bulkfeeds.net <\/em>are indexing my feed several times a day, I couldn&#39;t even find anything related to my site in their search engines. What a proof of efficiency.\n<\/li>\n<li>Some web-based readers have detailed user agent strings telling how many readers subscribed to your feed (Bloglines, IFeedYou, Rojo, NewsGator, Pluck, Feedster, Yahoo). 
This is smart, and I hope other similar services will do the same.\n<\/li>\n<li>I&#39;ve tried several web-based readers, and none is half as cool as <a href=\"http:\/\/bloglines.com\/\" rel=\"tag\">Bloglines<\/a>.\n<\/li>\n<li>&#8230; But I&#39;m still waiting for an invite to try <a href=\"http:\/\/feedlounge.com\/\">FeedLounge<\/a> :\u00de\n<\/li>\n<\/ul>\n<p>Now, as for the number of readers &#8230; If I take all requests, remove those from crawling bots and indexing services, and count the number of different pairs of IP + user agents, this gives a number of &#8230; 99 different daily readers ??<br \/>\nIs there a flaw in my thinking, or do I really have <em>one hundred <\/em>RSS readers ? Sure, I&#39;d be honored, but it seems way too much to me :P<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Having a look at my logs<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[21],"tags":[302,50,73,74,11],"class_list":["post-330","post","type-post","status-publish","format-standard","hentry","category-published","tag-feedburner","tag-geek","tag-rss","tag-stats","tag-web"],"_links":{"self":[{"href":"https:\/\/planetozh.com\/blog\/wp-json\/wp\/v2\/posts\/330","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/planetozh.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/planetozh.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/planetozh.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/planetozh.com\/blog\/wp-json\/wp\/v2\/comments?post=330"}],"version-history":[{"count":0,"href":"https:\/\/planetozh.com\/blog\/wp-json\/wp\/v2\/posts\/330\/revisions"}],"wp:attachment":[{"href":"https:\/\/planetozh.com\/blog\/wp-json\/wp\/v2\/media?parent=330"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\
/planetozh.com\/blog\/wp-json\/wp\/v2\/categories?post=330"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/planetozh.com\/blog\/wp-json\/wp\/v2\/tags?post=330"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}