Russ Nelson's blog
rants and raves
http://blog.russnelson.com/opensource/index.atom
Russ Nelson, me@russnelson.com
Copyright 2021 Russ Nelson
Pyblosxom 1.5.4.dev, http://pyblosxom.github.com/
Fixing RSS
http://blog.russnelson.com/2011/01/28/fixing-rss
2011-01-28T20:54:00Z
<p>RSS is a great technology, with a flaw. Currently, the way RSS works is
that you have a URL pointing to an RSS feed. This feed dynamically changes
to contain new entries as they're posted and removes old ones off the end as
they age out. But the "feed" is a complete XML document. So in order to get
updates, you need to fetch the entire document and examine it for entries
newer than the last time you fetched the document.</p>
<p>In other words, polling.</p>
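<p>To make the cost concrete, here is a minimal sketch of what a polling client does on every check. The feed text, the <code>new_entries</code> name, and the use of <code>pubDate</code> to decide what is new are all assumptions made for illustration; a real client would download the document each time instead of using a constant.</p>

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

# Hypothetical feed body; a real client would fetch this on every poll.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0"><channel>
<title>Example feed</title>
<item><title>Newest</title><pubDate>Fri, 28 Jan 2011 20:54:00 GMT</pubDate></item>
<item><title>Older</title><pubDate>Wed, 01 Apr 2009 15:30:00 GMT</pubDate></item>
</channel></rss>"""


def new_entries(feed_xml, last_seen):
    """Parse the whole feed and return the titles of items newer than last_seen."""
    root = ET.fromstring(feed_xml)
    fresh = []
    for item in root.iter("item"):
        published = parsedate_to_datetime(item.findtext("pubDate"))
        if published > last_seen:
            fresh.append(item.findtext("title"))
    return fresh


# Every poll re-reads every item, even when nothing has changed.
print(new_entries(SAMPLE_FEED, datetime(2010, 1, 1, tzinfo=timezone.utc)))
```

<p>Note that the whole document is parsed each time, even though only the items past <code>last_seen</code> matter.</p>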
<p>The problem with polling is that it consumes resources. Much better to
set up a communication point where the recipient (RSS client) waits for
the sender to post new items. In other words, streaming RSS, or SRSS.
Streaming XML is not a new concept. Jabber has been doing streaming XML for
years now. The problem with following that concept is that Jabber is
one-to-one communication. RSS is one-to-many. For small values of "many"
a standard HTTP server will work fine.</p>
<p>But think of Britney Spears' RSSer feed. She'll have millions of followers,
all of whom want to hold a TCP session open. This simply doesn't scale using
a general-purpose TCP/IP stack.</p>
<p>So, imagine a TCP/IP stack designed for streaming RSS. It would be able
to hold open literally millions of TCP sessions at the same time. Since it's
sending out the same content to many different recipients, each session just
needs a pointer into the content that has been sent so far, plus the remote
IP address and TCP port, and maybe a retransmission timer or two.</p>
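<p>A rough model of that bookkeeping, not a TCP stack: one shared byte log per feed, and a few fields per follower. The class and field names here (<code>Session</code>, <code>Feed</code>, <code>sent_offset</code>, <code>retransmit_deadline</code>) are invented for this sketch, as is the one-second retransmission timeout.</p>

```python
from dataclasses import dataclass


@dataclass
class Session:
    """Per-follower state: who they are and how far along they are."""
    remote_ip: str
    remote_port: int
    sent_offset: int = 0              # pointer into the shared content
    retransmit_deadline: float = 0.0  # when to resend if no ACK arrives


class Feed:
    """One copy of the content, shared by every session."""

    def __init__(self):
        self.content = bytearray()

    def post(self, data):
        """Append a new item to the shared log."""
        self.content += data

    def pending(self, session):
        """Bytes this session has not been sent yet."""
        return bytes(self.content[session.sent_offset:])

    def mark_sent(self, session, now, rto=1.0):
        """Advance the session's pointer and arm its retransmission timer."""
        session.sent_offset = len(self.content)
        session.retransmit_deadline = now + rto
```

<p>The point of the layout is that posting costs one append no matter how many followers there are; each follower adds only a small fixed-size record.</p>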
<p>When she posts something to her feed, it will be sent out using just two
packets per session: one carrying the data, and one coming back to ACK it. And yes, some
of the TCP sessions will be dangling and will send a TCP RST. But the rest
will receive their feed in real time, or as near as you can get there with 5 million TCP sessions to feed.</p>
<p>Now, all RSS clients will need to be modified to use SRSS. But the key here
is that even if they aren't, they'll still be able to fall back to plain RSS. As
long as the server can understand an appended ?streaming=yes on its feed URL,
the clients can be modified at whatever rate their authors desire.</p>
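<p>A sketch of how that opt-in could look on both ends, using only the standard library. The function names are made up for illustration; the only thing taken from the text above is the <code>streaming=yes</code> query parameter.</p>

```python
from urllib.parse import parse_qs, parse_qsl, urlencode, urlsplit, urlunsplit


def streaming_url(feed_url):
    """Client side: return feed_url with streaming=yes appended to its query."""
    parts = urlsplit(feed_url)
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
             if k != "streaming"]
    query.append(("streaming", "yes"))
    return urlunsplit(parts._replace(query=urlencode(query)))


def wants_streaming(request_url):
    """Server side: True only when the client explicitly asked for streaming."""
    values = parse_qs(urlsplit(request_url).query).get("streaming", [])
    return "yes" in values
```

<p>An unmodified client never sends the parameter, so <code>wants_streaming</code> is false and the server just returns the ordinary feed document.</p>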
<p>Been thinking about this for years, but I was prompted to write this up
by a posting by Dave Winer suggesting that we could <a href="http://scripting.com/stories/2010/09/22/rebootingRssInterlude.html">distribute the functionality of Twitter</a> using an RSS client and people's individual RSS feeds. This is a great idea at that level of the IP stack, but
when you go down one level to try to implement it, fetching a full RSS file
every time you check for news is incredibly inefficient and slow. Much
better to use SRSS so that when somebody posts to their RSSer feed, it
appears immediately.</p>
Archives
http://blog.russnelson.com/2009/04/01/open-data
2009-04-01T15:30:00Z
<p>Are you not a coder? Or are your coding skills rusty, having moved
on? No matter! You can still contribute to open source. Open source
is only one part of a program. The other part is open data. I'm
encouraging people to contribute to OpenStreetMap. We're running <a href="http://community.cloudmade.com/event">OpenStreetMap mapping
parties</a> all over the world. All skills taught! What's important
is your willingness to contribute to an Open Data project, and
location, location, location. We can only map where you are.</p>
Cloudmade, my new employer
http://blog.russnelson.com/2009/01/21/cloudmade
2009-01-21T08:25:00Z
<p>After 17 years of working for myself, I've decided to fire my boss, and hire
a new one, <a href="http://cloudmade.com/">Cloudmade</a>. We're working on
improving <a href="http://openstreetmap.org/">OpenStreetMap</a>, a
community-edited map. All sorts of geodata can, should, and need to be added to
OpenStreetMap. I'm available to give presentations about open data,
OpenStreetMap, and collaborative communities in the Northeast of the USA.</p>
<p>I'm also blogging over at the <a href="http://community.cloudmade.com/blog/">community Cloudmade</a> site.</p>
findwhistle
http://blog.russnelson.com/2008/11/01/findwhistle
2008-11-01T05:46:00Z
<p>I've experimented with keeping an audio recording in addition to a
GPS track of my bicycle rides. The trouble with a continuous audio
recording is that 1) it's long, 2) it's boring, and 3) the interesting
things are hard to seek to. If you could do reliable speech
recognition, you could say a word like "mark" or somesuch. However,
in my experience, street noise is going to kill you.</p>
<p>Better than that: detect a whistle. The code below will print the
duration of the whistle, the time from the beginning of the audio recording,
and the pitch of the whistle. The purpose of this is to be able to do
continuous audio recording, and yet be able to take a waypoint with an
audio annotation.</p>
<code>
<pre>
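#!/usr/bin/python3
import sys
import wave
import struct


def findwhistle(inwave):
    """Given an open wave file (assumed mono, 16-bit), print one line per
    whistle found and return the length of the recording in seconds."""
    rate = float(inwave.getframerate())  # samples per second
    framecount = 0      # samples consumed in earlier chunks
    zerocross = 0       # length in samples of the current half-cycle
    lastzerocross = 0   # length of the previous half-cycle
    zerocrosssum = 0    # total samples in the current run of steady half-cycles
    zerocrosscount = 0  # number of half-cycles in that run
    sign = 1
    while True:
        frames = inwave.readframes(100)
        if len(frames) == 0:
            break
        # The last chunk may hold fewer than 100 samples, so size the format.
        frames = struct.unpack("<%dh" % (len(frames) // 2), frames)
        for i, sample in enumerate(frames):
            if sign * sample > 0:
                zerocross += 1
            else:
                # Zero crossing. A whistle is a long run of half-cycles whose
                # lengths stay within one sample of each other.
                if abs(zerocross - lastzerocross) <= 1:
                    zerocrosssum += zerocross
                    zerocrosscount += 1
                else:
                    if zerocrosscount > 100:
                        duration = zerocrosssum / rate
                        start = (framecount + i - zerocrosssum) / rate
                        # Duration, start time, and half-cycles per second
                        # (twice the whistle's frequency in Hz).
                        print('! %4.2f %4.2f %5.0f' % (duration, start, zerocrosscount / duration))
                    zerocrosssum = 0
                    zerocrosscount = 0
                sign = -sign
                lastzerocross = zerocross
                zerocross = 1
        framecount += len(frames)
    return framecount / rate


def main():
    with wave.open(sys.argv[1], "rb") as f:
        print(f.getparams())
        print(findwhistle(f))


if __name__ == "__main__":
    main()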
</pre>
</code>
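<p>For trying the detector without going for a ride, a test recording can be synthesized. This is only a sketch: the function name, the 1000 Hz tone, the amplitudes, and the durations are arbitrary choices, and low-level random noise stands in for street noise. Noise is used instead of digital silence because a run of exact zeros looks like a steady run of zero crossings to the code above.</p>

```python
import math
import random
import struct
import wave


def write_test_wave(path, rate=8000, tone_hz=1000.0, quiet_s=1.0, tone_s=0.5):
    """Write a mono 16-bit WAV: quiet noise, a sine tone, then quiet noise.

    Returns the number of samples written.
    """
    rng = random.Random(0)  # fixed seed so the file is reproducible

    def noise(count):
        return [rng.randint(-200, 200) for _ in range(count)]

    samples = noise(int(rate * quiet_s))
    samples += [int(20000 * math.sin(2 * math.pi * tone_hz * n / rate))
                for n in range(int(rate * tone_s))]
    samples += noise(int(rate * quiet_s))

    with wave.open(path, "wb") as out:
        out.setnchannels(1)   # mono
        out.setsampwidth(2)   # 16-bit samples
        out.setframerate(rate)
        out.writeframes(struct.pack("<%dh" % len(samples), *samples))
    return len(samples)
```

<p>Calling <code>write_test_wave("test.wav")</code> and then running the script on <code>test.wav</code> gives it one synthetic whistle to find, starting about a second in.</p>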
SciPhone && Open Source
http://blog.russnelson.com/2008/10/29/SciPhone
2008-10-29T04:38:00Z
<p><a href="http://www.dealextreme.com/details.dx/sku.13241">These guys (SciPhone)</a> really REALLY ought to get together with some open source developers.
Looks like a great product, but it's almost 100% certain that their software
stinks.</p>