Forums are not Code Repositories!

Posted on: September 1, 2006

Please note that this post is entirely speculative and is in no way based on any methodical research. It is merely posted as an open question.

When I think about the vast “clusters of in-disposable information” that, according to Swedish astronomer and philosopher Peter Nilsson, are floating around the universe, and relate them to what is known as the Web, it is sometimes an encouraging thought that with just a few clicks and a well-formulated search string, thousands of corporate websites, myriads of forums and communities, and countless blogs and personal websites are now within every connected person's reach.

Though maybe not exactly what Engelbart, Bush, Nelson, Kay and the others had in mind, the Web sure does seem to realize at least some of the basic ideas of the Hyperlinked Knowledge-space, the Augmentation of the human intellect and the Ever-ubiquitous computer.

For all its glory, and though all may be good under the heavens, I for one have grown increasingly lazy in my meandering of the www. Sure enough, I use Google every day to look up details, refresh my memory on something long forgotten or simply to cross-reference, but I also find, and maybe even more so lately, that the actual sources I turn to most frequently have been reduced in number to maybe 8-10 websites. There are a number of reasons for this. One is that my areas of interest are maybe a bit narrow nowadays, mostly related to work. Thus any web search I perform tends to turn up the same 10-15 sites, some of which I have already dismissed as no good, and some of which require me to pay money or subscribe to piles of spam. That leaves me, of course, with a small subset of 5-6 pages that I keep returning to as my main sources of information.

So why is it that in such a vast ocean of resources, I keep to my own little pond of frogs and no longer allow myself to drift with the waves? Well, one thing that I have found very common, at least when it comes to problem solving, specifically in programming, is that a few sites with large user communities tend to score high in the search engines and thus grow even more popular, and it is within these sites that most of the information I find on minor sites and blogs around the web originates. This is perhaps best illustrated with a made-up example. (There are real ones as well, but forgive me for being lazy and not taking the time to look one up right now.)

Say developer A is trying to solve a problem that involves reading CSV files with lines of non-uniform length, containing floating point numbers, into an array in a bare-bones C program. A is an intermediate C programmer and could probably solve this task in about 2-3 hours at most, but for lack of time and inspiration he turns to the web to see if someone has a small piece of code that he can copy and use as it is. After searching the web for a few minutes he stumbles upon Forum X, where C developers discuss C programming problems and solutions. Here A posts a question, asking for a small function that would perform his task, then signs off and turns his attention to other parts of his program for a couple of hours. Checking back at Forum X later in the afternoon reveals a helpful reply from developer B, who happens to be sitting on just such a function, one that he himself found on a blog, copied and used in his own program. A thanks B, copies the code and adds it to his own source code. Sometime later developer C, faced with the same problem, finds the same Forum X, sees no need to look any further, copies the code and, for good manners, pastes the very same code to his weblog for others to enjoy.
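To make the example a little more concrete, here is roughly the kind of snippet that might end up circulating. This is only a minimal sketch written for this post, not code taken from any real forum or blog, and the names `csv_row` and `parse_csv_line` are made up. It also quietly assumes plain comma-separated numbers with no quoted fields and no empty fields, which is exactly the sort of assumption that gets lost once the code has been copied a few times.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical type for this sketch: one parsed line of numbers. */
typedef struct {
    double *values;  /* heap-allocated, owned by the caller on success */
    size_t  count;   /* number of values actually stored */
} csv_row;

/* Parse one line of comma-separated floating point numbers.
 * Lines may hold any number of fields (non-uniform length).
 * Assumptions: no quoted fields, no empty fields, no trailing comma.
 * Returns 0 on success, -1 on a malformed field or allocation failure.
 * On success the caller must free(row->values), even for an empty line. */
int parse_csv_line(const char *line, csv_row *row)
{
    size_t capacity = 8;
    const char *p = line;

    assert(line != NULL && row != NULL);

    row->count = 0;
    row->values = malloc(capacity * sizeof *row->values);
    if (row->values == NULL)
        return -1;

    /* A blank line is treated as a valid row with zero values. */
    p += strspn(p, " \t\r\n");
    if (*p == '\0')
        return 0;

    for (;;) {
        char *end;
        double v = strtod(p, &end);  /* strtod skips leading whitespace */

        if (end == p)                /* nothing numeric was consumed */
            goto fail;

        if (row->count == capacity) {
            /* Grow the array geometrically when it is full. */
            double *tmp;
            capacity *= 2;
            tmp = realloc(row->values, capacity * sizeof *tmp);
            if (tmp == NULL)
                goto fail;
            row->values = tmp;
        }
        row->values[row->count++] = v;

        /* After a number we expect either end of line or a comma. */
        end += strspn(end, " \t\r\n");
        if (*end == '\0')
            return 0;
        if (*end != ',')
            goto fail;
        p = end + 1;
    }

fail:
    free(row->values);
    row->values = NULL;
    row->count = 0;
    return -1;
}
```

In a real program this would typically be called once per line read with something like `fgets`. Whoever copies it without reading it will never find out that a line such as `1,,2` is rejected, or that `strtod` also happily accepts input like `inf` and `nan`.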

As this pattern repeats over and over again, the same little solution keeps getting copied, republished and reused. In a good scenario, the very first source at least becomes the starting point of endless variations on the same theme, perhaps not yielding very novel solutions but at least serving as study material for some. In a bad scenario, the solution simply gets copied and pasted again and again, requiring no understanding of its inner logic or workings on the part of the user. My favourite part is when a web search for a common problem leads to the exact same source code turning up at dozens of locations, all credited to different people…

While I am in no way against the posting, reuse and redistribution of information or solutions, I would also like to pose the question: is there no inherent danger at all in this scenario? Is there no risk of stagnation of novel solutions? What if the original solution has a bug or a flaw in its internals? The above example may be overly simplified for the sake of argument, but is this not a pattern that can also be seen in, for example, undergraduate scientific reports? Visual designs?

Of course the web also features a sometimes very strict and efficient “peer-review system” leading to the elimination of unreliable sources, the correction of errors in proposed solutions and the awarding of credit where credit is due, but this peer reviewing is in no way 100% reliable, especially since the number of available sources for any popular piece of information tends to grow exponentially on the web…

To sum it up, given that so many solutions (at least in the world of programming) get copied and republished in endless repetition, is there no risk of innovation in problem solving stagnating, or of aspiring developers not learning to perform the very basic steps of programming because they keep copying and pasting the same code over and over again?

I for one would just like to point out that forums, for example, are not code repositories, and that reading blogs cannot replace reading the documentation. If you want to become a good programmer, always try to solve your problems and tasks yourself before asking for solutions online (at least that way you will know how to pose a good question…).

Also, yes, do use forums and communities a lot! They are a great place for getting feedback on your designs and ideas, looking at how others solved similar issues, making friends and asking questions once you actually do get stuck. But again, they are not code repositories…


