hash_bucket()

Archive for September 2006

…seems to be one of the keywords that a lot of people find this blog with, so I figured I’d take a few minutes to explain the problem in more detail. Most of the information below can be found on MSDN, but maybe I can put it in more concise and easy-to-grasp language. (?)

The problem of encountering a collected delegate really needs to be understood in the context of the workings of the Garbage Collector (GC), which is responsible for reclaiming and tidying up the memory no longer in use by managed code. To put it simply, when no more references exist to a managed piece of memory, that piece or area can safely be marked as available and overwritten with new data. How does this happen in practice? I’ll start by explaining some basics. Let’s look at some code:

class Program
{
    class Point
    {
        public int X;
        public int Y;

        public Point(int x, int y)
        {
            this.X = x;
            this.Y = y;
        }
    }

    static void Main(string[] args)
    {
        Point p = new Point(0, 0);
    }
}

In this very simple piece of code we have our standard console application starting point, the Program class with its Main method, a nested class to represent a Point, and an instance of this Point class within the Main method.

No sweat so far, right? We all know that when the Point is declared and instantiated, a reference that points to a memory area within the managed heap is created on the stack. Thus what p really is in this case is a pointer on the stack, pointing to a location within the managed heap where our real data lives.

Now, for the execution lifetime of this very short program, the piece of memory on the managed heap that contains our Point data will remain “tagged” as being “in use” by the program. The reason for this is that p never goes out of scope. That is, a reference to this memory will always exist. When the GC makes its irregular sweep through our managed heap memory it will notice this, and thus it won’t touch this piece of memory but leave it intact.

(Actually it might move it around in memory to speed things up for our program. Should this happen our pointer p will be updated (by the GC) to point to the new location.)

Later in our program, if another piece of code tries to read the data that p points to it will find it (intact) by looking at the location that p points at. Still no sweat? Good.

Let’s look at another example:

  class Program
  {
    class Point
    {
      public int X;
      public int Y;

      public Point(int x, int y)
      {
         this.X = x;
         this.Y = y;
      }
    }

    static void Main(string[] args)
    {
        CreateAPoint();
    }

    private static void CreateAPoint()
    {
        Point p2 = new Point(0, 0);
    }
  }

In this sample our program has been extended with a fancy method, CreateAPoint, that creates a Point. This method is called in our Main method and it does nothing but declare and instantiate a Point called p2. After this the method exits.

All the stuff said above still goes for this new Point p2, but the interesting thing happens when we exit the method. As I said above, the memory pointed at by our previous Point p will remain tagged for the execution lifetime of our program, the reason being that p never went out of scope. In our new piece of code the variable p2 actually goes out of scope as soon as the method exits. Thus, after CreateAPoint has exited, the next time the GC sweeps the memory, it will find an area occupied by the data that p2 referred to, but no active references to it. In other words, there are no variables pointing at this piece of data. Thus, from the GC’s point of view it can be safely tagged as available to be overwritten by new data, or in GC terms, it will be Collected.
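To make the p2 scenario a bit more tangible, here is a minimal sketch that watches the Point through a WeakReference. The tracker variable, the NoInlining attribute and the forced GC.Collect() calls are my own additions for the demo, not something you would normally put in production code, and the GC gives no hard promise about exactly when the memory is reclaimed.

```csharp
using System;
using System.Runtime.CompilerServices;

class Point
{
    public int X;
    public int Y;

    public Point(int x, int y)
    {
        X = x;
        Y = y;
    }
}

class Program
{
    // The only strong reference to the Point is the local p2.
    // NoInlining keeps p2 confined to this method (a demo assumption).
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static WeakReference CreateAPoint()
    {
        Point p2 = new Point(0, 0);
        // A WeakReference lets us observe the object without keeping it alive.
        return new WeakReference(p2);
    }

    static void Main()
    {
        WeakReference tracker = CreateAPoint();

        // Forcing a collection is for demonstration purposes only.
        GC.Collect();
        GC.WaitForPendingFinalizers();
        GC.Collect();

        // Usually reports False here, but the timing is up to the GC.
        Console.WriteLine("Point still alive: " + tracker.IsAlive);
    }
}
```

If you instead keep p2 in a static field, the same check should keep reporting that the Point is alive, which is the situation of p in the first sample.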

Actually all the Reference Types:

  • Classes
  • Boxed Value Types
  • Arrays
  • Delegates

behave in the same way, and as you can see from the list this includes our main character for this post, the Delegate.

So what is a Delegate? Well, if you ask a C++ programmer she will tell you that it is a function pointer, or sometimes a callback function, and while they share many similarities, there are also some important differences (the delegate being type-safe, for example). In C++ a function pointer is used to pass a function as a parameter to another function. A very useful feature indeed. In C# one of the most common uses of delegates is to declare Events and Callbacks that enable various objects in our code to report state and pass information between them.

The above can be seen in the following sample code:

using System;

delegate void SomethingHappened(string What);

class Program
{
    public static event SomethingHappened SomethingHappenedInstance;

    static void Main(string[] args)
    {
        // Subscribe our handler, then raise the event.
        SomethingHappenedInstance += new SomethingHappened(Program_SomethingHappenedInstance);
        SomethingHappenedInstance("Roger Wilco has left the Building");
    }

    static void Program_SomethingHappenedInstance(string What)
    {
        Console.WriteLine(What);
    }
}

This piece of code declares a delegate called SomethingHappened that takes a string as its parameter. What it does with this string is entirely up to our own implementation of this delegate as we will soon see. Within our Program class we then declare an event that appears to be of type SomethingHappened. We call it SomethingHappenedInstance. Next in our code is where the magic happens. 

I said earlier that a delegate can be thought of as a pointer to a method or a callback method. To explain this in a simple way, we can think of it as containing a list of methods that match its signature (in this case, taking a string and returning nothing) and invoking these methods one by one when it is “activated”. In our case its list contains only one method (called Program_SomethingHappenedInstance) that we added in the line

SomethingHappenedInstance += new SomethingHappened(Program_SomethingHappenedInstance);

When we invoke our SomethingHappenedInstance:

SomethingHappenedInstance("Roger Wilco has left the Building");

It will call all the methods in its “list” and pass the string we sent in to each and every one of them.
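The “list of methods” idea is easier to see with more than one entry in the list. Here is a small sketch where two handlers are attached to the same delegate; FirstHandler, SecondHandler, Run and the Log list are names I made up for the illustration.

```csharp
using System;
using System.Collections.Generic;

delegate void SomethingHappened(string what);

class Program
{
    // Records what each handler received, in the order the handlers ran.
    public static readonly List<string> Log = new List<string>();

    static void FirstHandler(string what)
    {
        Log.Add("first: " + what);
    }

    static void SecondHandler(string what)
    {
        Log.Add("second: " + what);
    }

    public static void Run()
    {
        Log.Clear();

        // Build up the delegate's invocation list one method at a time.
        SomethingHappened handlers = null;
        handlers += FirstHandler;
        handlers += SecondHandler;

        // One call invokes every method in the list, in the order added.
        handlers("Roger Wilco has left the Building");
    }

    static void Main()
    {
        Run();
        foreach (string line in Log)
        {
            Console.WriteLine(line);
        }
    }
}
```

Each handler gets the very same string, so the log ends up with one “first” entry followed by one “second” entry.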

Good so now we have a basic idea of what a delegate is. Remember though that since a delegate instance is a reference type it obeys the same laws about scope and garbage collection as the first little Point class I showed.

On to the juicier parts. What about those pesky collected delegates? Well, by now you should be able to understand the problem of a CallbackOnCollectedDelegate basically by looking at the name itself. Or at least to some extent. Actually there is another tiny matter that also needs to be understood in order to come to grips with the CollectedDelegate problem. Basically the problem will only arise when the callback begins its life in the world of managed code, and then is passed through interop to unmanaged code. The reason for this is that once the address that the method pointer points at has been sent off to Unmanaged Land, there is no way for the GC to know that there still exists a reference to this address. This is of course because whatever data is handled in unmanaged code lives outside the managed heap, and thus out of sight of the GC. With this said, let’s move on.

Since a delegate is a reference type, the memory it occupies on the heap becomes eligible for collection once no managed reference to it remains, for example when the variable holding it goes out of scope. The GC will then mark that memory region as “empty” (that is, available for use), and the callback that tries to call the delegate will call into open space. If this was tricky and abstract, let’s look at a real world example.

Say I send you a postcard that says “Please send this back to the same address it came from as soon as you get it!”. Then after sending it off I move to a new address. You send the card back, but when it arrives where I used to live I am nowhere to be found. This will make a lot of people sad. I will be sad because I never got your card, the postman will be sad and confused because he can’t do his job, and you will be sad because eventually the card will return to you with an error message attached to it.

So, let’s look at an example where the CallbackOnCollectedDelegate will NOT happen:

using System;

delegate void SomethingHappened(string What);

class Program
{
    // Declared at class level, so the delegate stays referenced for the whole run.
    public static event SomethingHappened SomethingHappenedInstance;

    static void Main(string[] args)
    {
        SomethingHappenedInstance += new SomethingHappened(Program_SomethingHappenedInstance);
        DoSomething(SomethingHappenedInstance, "Print a line");
        Console.Read();
    }

    static void Program_SomethingHappenedInstance(string What)
    {
        Console.WriteLine(What);
    }

    static void DoSomething(SomethingHappened callback, string whatdoto)
    {
        Console.WriteLine("hej");
        // Report back to the caller through the callback.
        callback("Done");
    }
}

In this case we have a function called DoSomething. It takes a callback that it can invoke to tell us when it is done, and a string that defines an action. In this case, since our callback, which is a delegate, was declared in the class scope of our Program class, it will remain in use, untouched by the GC for the execution lifetime of our program. This means that whenever DoSomething invokes the callback, it will always find its way back to the correct method in memory.

However, if the method that we send the callback to lives in unmanaged code, and the memory that the callback points to gets Collected by the GC, then when the unmanaged code tries to invoke the callback it will call into nothing, giving rise to a CallbackOnCollectedDelegate.
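For the unmanaged case, here is a hedged, Windows-only sketch of the usual way out: keep a managed reference to the delegate for as long as unmanaged code may call it. I picked the Win32 function SetConsoleCtrlHandler in kernel32.dll purely as an example of native code that stores a callback; the names ConsoleCtrlHandler, Handler and OnConsoleCtrl are my own.

```csharp
using System;
using System.Runtime.InteropServices;

class Program
{
    // Managed signature for the native console control callback.
    public delegate bool ConsoleCtrlHandler(uint ctrlType);

    [DllImport("kernel32.dll", SetLastError = true)]
    static extern bool SetConsoleCtrlHandler(ConsoleCtrlHandler handler, bool add);

    // The static field is the managed reference that keeps the delegate
    // alive for as long as the native side may call back into it.
    public static readonly ConsoleCtrlHandler Handler = OnConsoleCtrl;

    public static bool OnConsoleCtrl(uint ctrlType)
    {
        Console.WriteLine("Console control event: " + ctrlType);
        return false; // false lets the next (default) handler run as well
    }

    static void Main()
    {
        // Risky variant (not used here): passing "new ConsoleCtrlHandler(OnConsoleCtrl)"
        // directly leaves no managed reference once the call returns, so the GC
        // may collect the delegate while Windows still holds its address.
        SetConsoleCtrlHandler(Handler, true);

        Console.WriteLine("Press Ctrl+C, or Enter to exit.");
        Console.ReadLine();
    }
}
```

The point is simply that the GC can see the static field, even though it cannot see the copy of the address held by Windows. For a callback that is only needed during a single native call, a local variable combined with GC.KeepAlive after the call would be another option.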

Did that make sense?

Here is a link to an article on MSDN that also explains the whole issue:

http://msdn2.microsoft.com/en-us/library/43yky316.aspx

I for one find it increasingly difficult to remember all the interesting URLs that I stumble upon when meandering the web. I am also very bad at actually making use of bookmarks, as I often find that the name of a bookmark fails to give an adequate description of where it points. Instead, I depend more and more on Google to find me the page I was looking for, by doing a search on the same keywords that made the site stick in my memory in the first place…

Also, with the rapidly decreasing supply of available domain names, a lot of smaller companies or start-ups find it difficult to register a logical domain that reflects the name of their business. Given these two “facts”, the only logical thing to do of course is to register a name that is loosely related to your company, and then inform your audience of what search terms they should use to locate you on Google.

I’m not sure about the rest of the world, but here in Tokyo it has become increasingly common to see advertisements on trains and billboards that simply feature a text-box with a keyword and a button next to it that says “Search”. Some of my favorites include the words “1600yen” and “Eat Fish Now”. I also learnt recently that the car maker Pontiac had done a similar marketing stunt by simply using a screen-shot of Google with their latest car model name in the search box. (Google apparently received no royalty in this case.)

I guess the point is: who needs DNS or even a domain name when we can just use Google? Or in other words, is Google the new “DNS”?

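I went back home to Sweden to have a traditional wedding in a church near my parents’ house. It was a lot of fun. Sachiko and I stood together at the front of the church, exchanged our vows of love, and were showered with rice outside. It was a full-spec wedding. :) I was very happy that Sachiko’s parents came all the way from Japan to take part. I am also very grateful that so many of our Swedish friends took time off work and came all the way to Vetlanda. And of course my own parents worked very hard to organize a wonderful wedding for us. We are truly in their debt.

I really don’t know how to say “Thank you!!” to everyone, but thanks to all of you this has become one of the happiest memories of my life!

Sa-chan! Let’s stay together and lovey-dovey forever!! ::))))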

Please observe that this post is entirely speculative and is in no way based on any methodological research. It is merely posted as an open question.

When thinking about the vast “clusters of in-disposable information” floating around the universe, according to Swedish astronomer and philosopher Peter Nilsson, in relation to what is known as the Web, it is sometimes an encouraging thought that with just a few clicks and a well formulated search-string, thousands of corporate websites, myriads of forums and communities and countless blogs and personal websites are now within every connected person’s reach.

Though maybe not exactly what Engelbart, Bush, Nelson, Kay and the others had in mind, it sure does seem to realize at least some of the basic ideas of the Hyperlinked Knowledge-space, the Augmentation of the human intellect and the Ever ubiquitous computer.

All its glory aside, and all being good under the heavens, I for one have still grown increasingly lazy in my meandering of the www. Sure enough, I use Google every day to look up details, refresh my memory on something long forgotten or simply to cross reference, but I also find, and maybe even more so lately, that the actual sources I turn to most frequently have been reduced in number to maybe 8-10 websites. There are a number of reasons for this. One is that my areas of interest are maybe a bit narrow nowadays, mostly related to work. Thus any web search I perform tends to turn up the same 10-15 sites, of which I have already dismissed some as being no good, and some of which require me to pay money or subscribe to piles of spam. This leaves me of course with a small subset of 5-6 pages that I keep returning to as my main sources of information.

So why is it that in such a vast ocean of resources, I keep to my own little pond of frogs and no longer allow myself to drift with the waves? Well, one thing that I have found very common, at least when it comes to problem solving, specifically in programming, is that as a few sites with large user communities tend to score high in the search engines and thus grow even more popular, it is within these sites that most of the information I find on minor sites and blogs around the web originates. This is perhaps best illustrated with a fake example. (There are real ones as well, but forgive me for being lazy and not taking the time to look one up right now.)

Say developer A is trying to solve a problem that involves reading non-uniform length CSV files with floating point numbers into an array in a bare-bones C program. A is an intermediate C programmer and could probably solve this task in about 2-3 hours at most, but for lack of time and inspiration he turns to the web to see if someone has a small piece of code that he can copy and use as it is. After searching the web for a few minutes he stumbles upon Forum X, where C developers discuss C programming problems and solutions. Here A posts a question, asking for a small function that would perform his task, then signs off and turns his attention to other parts of his program for a couple of hours. Checking back at Forum X later in the afternoon reveals a helpful reply from developer B, who happens to sit on just such a function that he himself found on a blog, copied and used in his own program. A thanks B, copies the code and adds it to his own source code. Sometime later, developer C, faced with the same problem, finds the same Forum X, sees no need to look any further, copies the code and for good manners pastes the very same code to his weblog for others to enjoy.

As this pattern repeats over and over again, the same little solution keeps getting copied, republished and reused. In a good scenario at least the very first source becomes the starting point of endless variations on the same theme, perhaps not rendering very novel solutions but at least serving as studying material for some. In a bad scenario, the solution simply gets copied and pasted again and again, requiring no understanding of its inner logic or workings on the part of the user. My favourite part is when a web search for a common problem leads to the exact same source code turning up at dozens of locations all accredited to different people…

While I am in no way against the posting, reuse and redistribution of information or solutions, I would also like to pose the question, is there no inherent danger at all in this scenario? Is there no risk of stagnation of novel solutions? What if the original solution has a bug or a flaw in its internals? The above example may be overly simplified for the sake of an argument, still is this not a pattern that can be seen in for example undergraduate scientific reports? Visual designs?

Of course the web also features a sometimes very strict and efficient “peer-review system” leading to the elimination of unreliable sources, correction of errors in proposed solutions or the rewarding of credit where credit is due, but this peer-reviewing is in no way 100% reliable, especially since the number of available sources for any popular piece of information tends to grow exponentially on the web…

To sum it up, given that so many solutions (at least in the world of programming) get copied and republished in endless repetition, is there no risk of innovation in problem solving stagnating, or of aspiring developers not learning to perform the very basic steps of programming because they keep copying and pasting the same code over and over again?

I for one would just like to point out that forums, for example, are not code repositories, and that reading blogs cannot replace reading the documentation. If you want to become a good programmer, always try to solve your problems and tasks yourself before asking for solutions on-line (at least that way you will know how to pose a good question…).

Also, yes, do use forums and communities a lot! They are a great place for getting feedback on your designs and ideas, looking at how others solved similar issues, making friends and asking questions once you actually do get stuck. But again, they are not code repositories…



This blog has no clear focus. It has a focus though, it's just not very clear at the moment...

Dev Env.

Visual Studio 2008 Prof / NUnit / Gallio / csUnit / STools (ExactMagic) / doxygen / dxCore / TypeMock / TestDriven.net / SequenceViz / CLRProfiler / Snoop / Reflector / Mole / FxCop / Subversion / TortoiseSVN / SlikSVN / CruiseControl.net / msbuild / nant
