Social Storm - Breaking News and the Saturation of the Internet

Michael Jackson's tragic death this past week created an unprecedented wave of traffic on the Internet that temporarily overloaded many sites and services. As CNN noted, the advent of social media has dramatically increased the amplifying effect of big news stories. Several years ago, news stories reported reactions from the blogosphere as the voice of ordinary people. Today, ordinary people all over the world can communicate with many people in real time using services like Facebook and Twitter. It seems as though overnight our collective ability to generate traffic on the Internet has spiked far beyond the capacity of current systems to deal with peak load in the face of dramatic events like Jackson's death.

More significantly, we are only just seeing the very beginning of the bandwidth-gobbling revolution known as social media. As AT&T is fast learning in the US market, mobile Internet usage will skyrocket as handsets and services become more sophisticated and mobile bandwidth grows. Internet engineering experts have been predicting the saturation of the Internet for several years; it appears that we have now arrived at the saturation point during peak usage. How long will it be before Internet brownouts become a common occurrence? The answer may depend on when the global economy recovers from its current doldrums.

Why Wolfram Alpha will not change the world

In case you haven't heard the story about the next "greatest thing since sliced bread", there is a new search engine, of sorts, called Wolfram Alpha. Wolfram is basically a big computation engine with a vast store of data to compute against. The press has been agog with the possibilities of the system, calling it a "Google killer" and more. Pardon me for my skepticism, but I don't think Wolfram will change the world.

Sure, it's a handy thing, a very handy thing, but it has significant limits. Getting data into the system is a manual affair that requires human curation, so from the very start of the project, its owners have a pipeline problem - the breadth of the engine's knowledge is limited by the amount of information their human agents can put into it. Unless they change that model, the pipeline problem never goes away. In fact, the problem only becomes bigger over time as humans generate more and more data for potential inclusion into the system.

Wolfram, in essence, has returned to the original Yahoo model of curated content, only using structured data and a more sophisticated search system. Yahoo gave up on curating data manually - it was just too inefficient and costly compared to automated indexing. While Wolfram will excel at solving the kind of complicated mathematical problems that it specializes in, I don't see it outshining Google.

Google's advantage (and disadvantage, as I have discussed previously) is the sheer volume of information in its index. I can search on Google and generally find an answer to a question within a few clicks. Why is that? In short, Google and other automated indexing engines rely on the millions upon millions of people around the world who contribute content to the Internet in the form of web pages, wikis, blogs, and many other formats. Wolfram, on the other hand, relies on its own staff of curators to add data to the system.

And let's not forget about Twitter and other social media as the newest form of content contribution. If Twitter embodies Web 2.0, Wolfram Alpha embodies yet another take on Web 1.0. Don't get me wrong, Wolfram will be a boon to people doing some forms of research, but it will never live up to the hype that has been created around it.

Introducing the Colony application platform

For the past three years, I have been working on and off on an open-source CFML-based Web application. It started out as a simple system to store arbitrary structured and unstructured content. I started using it to build more and more complex Web applications, and over time it grew in size and scope. I thought about it for a while as a content management system, but content management is not what I was aiming for, and not where the platform has really evolved.

After struggling with terminology and purpose, I started thinking about the application as an application platform. What is that? I see it as an implementation of typical application patterns in an integrated package that allows a developer to use it in whole or in part, building on the core libraries to create a new solution.

Once I had the concept clear in my head, I started casting about for a name. After lots of pondering and brainstorming sessions with my colleagues on the CF-Community list, I decided to call the platform Colony. To me, Colony is all about staking out new territory on the Web, building compelling new services, and advancing the state of software.

Colony is also about shared effort and shared reward. To that end, we have just released the platform in alpha under the Apache Software License 2.0. You can get the alpha code and see more about the platform at www.cfcolony.org. The site is graphically challenged and light on content at the moment, but that will change soon.

Ext.History as a Controller for JS Applications

We have been implementing an ecommerce solution using the ExtJS framework, and one of the challenges our clients asked us to meet was to enable Back button support in the fully AJAX-enabled UI. After looking around at possible solutions, I decided to implement the Ext framework's Ext.History class as the history provider. Some example code at the Ext site gave me an idea - that it is possible to use the History class's on() method to build a controller for a JS application that respects the Back button, allows bookmarking, and provides centralized application flow.

Here is my initial implementation:

    Ext.onReady(function() {

        // initialize the History provider
        Ext.History.init();
        // use forward slash for the delimiter
        var delim = '/';

        // set the on('change') event to act as the controller for browser history
        Ext.History.on('change', function(urlString) {
            if (urlString) {
                // named "args" so it does not shadow the built-in arguments object
                var args = urlString.split(delim);
                switch (args[0]) {
                    case 'getPage':
                        getPage(args[1], args[2]);
                        break;
                    case 'doSearch':
                        doSearch(args[1], args[2], args[3], args[4], args[5]);
                        break;
                }
            } else {
                // reset the page
                setDefaults();
            }
        });

    });


A typical call to a function that I want to track then changes from a direct call to the function to a call to Ext.History.add, so:

    getPage(#rootnode.objectid#,'webPage')

becomes

    Ext.History.add('getPage/#rootnode.objectid#/webPage');

and now I have a controller for the JS application. I may need to re-structure some code to make it neat and tidy, but I can now see the core of a well-structured solution taking shape.
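
As a rough sketch of where that re-structuring could go, the switch statement could be swapped for a lookup table of handlers, so that adding a tracked function means adding one entry rather than one more case. Everything named here (createController, the routes object, the calls log) is hypothetical and for illustration only, not part of ExtJS. The sketch runs without the framework; hooking it up would presumably look like Ext.History.on('change', controller).

```javascript
// Hypothetical helper, not part of ExtJS: builds a function that takes a
// history token such as 'getPage/42/webPage' and dispatches to a handler.
function createController(routes, onEmpty, delim) {
    delim = delim || '/';
    return function (token) {
        if (!token) {
            // empty token: reset the page
            return onEmpty();
        }
        var args = token.split(delim);
        var action = args.shift();          // first segment names the handler
        var handler = routes[action];
        if (typeof handler !== 'function') {
            return undefined;               // unknown action: ignore it
        }
        return handler.apply(null, args);   // remaining segments become arguments
    };
}

// Stand-in handlers that only record what they were called with.
var calls = [];
var controller = createController({
    getPage: function (id, type) {
        calls.push(['getPage', id, type]);
    },
    doSearch: function () {
        calls.push(['doSearch'].concat(Array.prototype.slice.call(arguments)));
    }
}, function () {
    calls.push(['setDefaults']);
});

controller('getPage/42/webPage');   // dispatches to getPage('42', 'webPage')
controller('');                     // falls through to the reset handler
```

Note that every argument arrives as a string because it comes out of the URL token, so a handler that needs a number has to convert it itself.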

Google's Mission to Penetrate the Deep Web

Google is building a software program that will conduct searches of public databases on the Web to try to ascertain their contents. The goal behind this move is to index and make available information that is not currently available - like flight schedules and fares, to use an example from the CNet article. This development raises two important questions for consideration. First, are there any legal issues for Google to conduct data mining from public databases? Second, who will pay for the bandwidth and CPU charges for Google's activities?

On the first question, it remains to be seen whether anyone will object on legal grounds to the searches. Google can certainly provide a way for companies to opt out of the searches using standard robot/user agent techniques currently employed to manage search engine crawlers, which may make the legal issues moot. 

On the second question, there is a very real prospect that Google will add significant traffic to a site's search system, potentially costing the company maintaining the site both in bandwidth and server charges. For sites hosted in a cloud environment, those costs could be precisely quantified. So who will pay for the additional traffic? If Google provides an opt-out solution that companies can easily deploy, one could argue that any company that neglects to opt out of the searches is by inference allowing Google to conduct the searches and so agreeing to incur the costs associated with them.

On the other hand, one could argue that Google has an obligation to proactively notify companies if it plans to change the way it indexes their systems in a way that may force them to incur additional costs, which effectively takes us back to the first question of legal issues. 

In the bigger picture, Google's move is just a first step in what will inevitably be industry-wide attempts to better expose and share data buried in databases around the world. Though the Semantic Web has so far failed to attract a huge following, we can reasonably expect that either it or some other technology will take hold and begin to shape the next generation of knowledge sharing on the Internet.

Facebook Terms of Service and the Populism of Social Networks

If you are not on Facebook, or if you had not yet heard the news, Facebook recently changed its Terms of Service to give itself more legal room to use the data you store on its system, granting itself a perpetual license to use your data in any way it sees fit.

The change prompted an uproar on Twitter among Facebook users, and has now led to a Facebook group called "People Against the new Terms of Service (TOS)". So much for dictating terms to the masses.

What is most intriguing about this development is how word of the TOS (a rather bland piece of legalese that is standard fare for such sites) spread via Twitter and generated a huge public backlash against Facebook, forcing the company to go on the defensive and explain the change to its community. 

As I just recently wrote, Facebook is an interesting case study in a community that has both real leverage and a curiously fragile hold on that leverage. The rough public reaction to the change in the TOS is just the sort of event that could trigger desertion among the FB faithful and put in peril its meteoric rise. Will it? I doubt it, but we'll see how Facebook handles the backlash.

Facebook - Looming Social Network Monopoly, or Hint of Open Future to Come?

I was having dinner with friends this evening when the discussion turned to Facebook. A concern was raised - is Facebook becoming a monopoly? If the value of a network expands with the number of nodes on the network, one could argue that Facebook is close to achieving serious market power in the social networking industry. According to the latest stats, Facebook now boasts 150 million members. That's nothing to sneeze at.

However huge their market share at the moment, what market power does Facebook really enjoy? Social networking applications are easy to build and promote. Infrastructure costs for nascent Internet properties are minimal. Open Internet protocols and public APIs, the hooks that systems expose to other systems for communication purposes, make connecting to Facebook and other sites a snap.

Facebook has a very good position today, a position built on innovation. So what could knock it off its perch? More innovation. Sooner or later someone will come along who will out-Facebook the people at Facebook. That's just the nature of the Internet, which makes the process of creative destruction almost effortless. You disagree? Just consider that Twitter, the current doyenne of the technorati, has a staff of 29. YouTube, when it agreed to a $1.65 billion buyout, had a staff just north of 120. When is the last time 100 people made a billion dollars' worth of anything?

My current thinking is that within five to ten years, social networking sites will evolve into an interconnected web of systems using standard APIs and some form of shared identity management. That approach will be a huge boon to users but will tend to undermine the market advantages of market leaders like Facebook. The current patchwork of identity plugins (give Facebook your Gmail credentials to import contacts) will not scale over the long term and must be replaced by a more robust, open system, giving customers a better experience and challengers in the space an opportunity to carve out their own niches.

New Developments in Blog Spam

I just saw a few comments posted on an old blog entry that were a somewhat subtle take on the practice of blog comment spamming. I had five entries in a row from various aliases that all had one word, "thanks", in the comments, but that all linked to a random web site - as the user's site - with marketing information for various products in it.

I see the technique as an improvement on the blog spamming practice of posting large numbers of links in the blog comments, but it is still spam and therefore worthy of immediate deletion. Do you see large amounts of spam on blog comments? Have spammers gotten better at hiding their intentions?

What are the Odds? The Issue of Context

I came across an interesting problem using Google today. I was trying to look up some gambling odds (not a gambler myself, just interested in a particular topic). I searched for the term "London bookmaker". Google returned what looked like three pages of metadata about London bookmakers - lists of London bookmakers, news stories about London bookmakers - before it returned an actual link to a bookmaker in London. Most people never get to page 3, so as a software geek, that placement really bothers me. Why?

The issue is context. I was trying to find a bookmaker in London. Not any particular bookmaker, just a bookmaker so I could see if they were carrying odds on a particular question. Returning lists of bookmakers that do not include links to their web sites does nothing for me but waste my time. I suppose I could Google for the name of the bookmakers on one of the lists, but shouldn't Google know to do that for me? Here is where I think Google has a real problem. Their ranking formula depends so heavily on popularity that metadata in cases like this will always be ranked far higher than the underlying data itself - the web sites of bookmakers in London, which is really what I need to find.

To me, the problem of context is functional. In my case, I am trying to find out information about a specific service that London bookmakers offer. I'm not going to call London for the information; I need and expect a web site where I can get it. And of course, there are web sites of London bookmakers that offer odds on all sorts of things, but it took Google until page 3 to show me even a single one.

Let's say, for instance, that I wanted to know the odds on whether the Cardinals would win the Super Bowl. I should be able to type in "odds on Cardinals Super Bowl" and get the latest odds back. Shouldn't I? Isn't that the goal of Google, to index everything? (I just tried that and got - guess what - a list of news stories about the odds). I am leaving aside the issues of who would provide the odds and how compensation models would work for the Google results. Perhaps those are sticky questions. Sticky, I would wager, but hardly insurmountable. I wonder what the odds are on that? Maybe Google can tell me. Or maybe not.

Twitter's secret business model and its implications

I started playing with Twitter last week during MAX. It's really quite addictive, I've found. For those of us who spend the majority of our days (and nights) in front of computer screens, Twitter is a very cool way to stay in touch with others.

Twitter rejected a decent offer from Facebook and says their "secret business model" might be worth well more than the Facebook offer. That made me wonder what the secret model was, and I got to thinking about it. Here's what I came up with:

1. They want to sell ads on Twitter traffic. Bingo. It's a very interesting model because individual users with lots of followers would provide much larger revenue streams than people with fewer followers. Does anyone know if any celebrities are on Twitter? Just wait until tweens all get Twitter mobile clients and the next Miley Cyrus-esque Disney star comes along. Which leads to two interesting questions: could celebrities with lots of followers demand royalties from Twitter for their daily chatter? And even more weirdly, could a celebrity make a living just being on Twitter? Could there be people whose entire lives and fortunes revolve around their success on Twitter?

2. They might mine the data from Twitter for psychographics and deliver ads to users based on that rather than on pure popularity, or perhaps through a blended methodology of some sort.

Whatever their actual plan, it seems like too fat a target for them to resist. That traffic could be worth more than Google. I wonder, are the brains at Google secretly working on a competitor to Twitter? They must be; they would be foolish not to have seen this coming. The big IM client players - Skype, MSN, Yahoo, and AOL - could morph their platforms into something like Twitter, and if they can maintain the open compatibility that they have now, they could create a huge Twitter-like ecosystem overnight. All the IM client players would benefit through ad revenues as messages were delivered from followers on one network to the people they were following on another network.

Who knows if it will happen, but it certainly looks like there could be a big shift in ad revenues from services like search to what I see as the second coming of push services. This time they might actually work.


BlogCFC was created by Raymond Camden. This blog is running version 5.8.001.