Monday, January 30, 2012

VSX Add-in Development Adventures: Window.Object returns NULL

I’m in the middle of developing an add-in for Visual Studio 2008, a simple one that would help our in-house code review process. If you’ve ever written a VS add-in, you’d know the basic routine – Visual Studio generates a class named Connect that implements IDTExtensibility2, and you put your code in the methods OnConnection, OnDisconnection, OnAddInsUpdate, OnStartupComplete and OnBeginShutdown.

I developed a few user controls (UserControl objects) for showing some UI as tool windows. One of them was AddCommentView which is just a dumb UserControl that displays some textboxes and buttons and also takes in some input. I deployed this control as a separate DLL and linked it to the add-in. Then, as usual, the following code should do the job of creating the tool window.
if (_addCommentWindow == null)   
{    
    object objTemp = null;    
    string guid = Guid.NewGuid().ToString();    
    Windows2 toolWins = (Windows2) this._applicationObject.Windows;

    // Create the tool window view for adding the comment   
    _addCommentWindow = toolWins.CreateToolWindow2(_addinInstance, "Controls.dll", 
        "Controls.AddCommentView", "Add Comment", "{" + guid + "}", ref objTemp);
}    

And like always it worked just fine till I tried to test it and that’s when I realized something wasn’t quite right.
The CreateToolWindow2 method has two outputs – the tool window (which is returned by the method) and the hosted user control (the ref parameter ‘objTemp’ in this case). The hosted user control can also be retrieved through the Window.Object property of the returned tool window. The problem was – no matter what I tried the Window.Object property always returned null and so did ‘objTemp’.
AddCommentView addCommentViewControl = _addCommentWindow.Object as AddCommentView;   
addCommentViewControl.StartLine = selection.AnchorPoint.Line;    
addCommentViewControl.EndLine = selection.ActivePoint.Line;

The solution – I was acting dumb, very dumb. Visual Studio requires all hosted controls to be visible to COM. After I marked the entire assembly with [assembly: ComVisible(true)], everything started working like a charm (you can also apply the attribute to just the individual control if you don’t want to make the entire assembly COM-visible).
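For reference, marking only the control might look like the sketch below. The AddCommentView here is a stand-in that does not derive from UserControl, so that the snippet compiles without Windows Forms, and ComVisibilityCheck is a hypothetical helper that is not part of the add-in; it only confirms through reflection that the attribute is present.

```csharp
using System;
using System.Runtime.InteropServices;

// Stand-in for the real control. In the add-in this class would derive
// from System.Windows.Forms.UserControl (assumption for illustration).
[ComVisible(true)]
public class AddCommentView
{
    public int StartLine { get; set; }
    public int EndLine { get; set; }
}

// Hypothetical helper, not part of the Visual Studio API.
public static class ComVisibilityCheck
{
    // Returns true when the type itself carries [ComVisible(true)].
    public static bool IsMarkedComVisible(Type type)
    {
        object[] attributes =
            type.GetCustomAttributes(typeof(ComVisibleAttribute), false);
        return attributes.Length == 1
            && ((ComVisibleAttribute)attributes[0]).Value;
    }
}
```

A defensive add-in could also test the result of `_addCommentWindow.Object as AddCommentView` for null before touching its properties, so that a missing attribute shows up as a clear message rather than a NullReferenceException.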

Defining custom Dictionary keys in C#

Every so often we come across cases where we need to store objects mapped to other objects in a Dictionary. And in most of these cases, the default solution of using the object reference as a key doesn’t really work. For example, if an object Person { Name = "Foo" } is mapped to value ‘x’, another object Person { Name = "Foo" } does not get us to the value ‘x’, although we probably would expect it to return ‘x’.
class Person    
{     
    public int Age { get; set; }     
    public string Name { get; set; }     
}

class Program    
{     
    static void Main(string[] args)     
    {     
        var p1 = new Person() { Age = 30, Name = "p1" };     
        var p2 = new Person() { Age = 31, Name = "p2" };     
        var p3 = new Person() { Age = 32, Name = "p3" };
        var personVsBonus = new Dictionary<Person, int>();    
        personVsBonus[p1] = 1000;     
        personVsBonus[p2] = 2000;     
        personVsBonus[p3] = 3000;
        var personToFind = new Person() { Age = 31, Name = "p2" };    
        Console.WriteLine(personToFind.Name + " will be given a bonus of " + personVsBonus[personToFind]);     
    }     
}     

The above program throws a KeyNotFoundException: the dictionary cannot locate the entry because, by default, it compares keys by reference, and personToFind is a different object from p2.
In such cases you need to define your own implementation of what the dictionary should consider as the key value. In the example we just discussed, we might like the property Person.Name to be considered as the key. Whenever a dictionary needs to decide where an item should be located, it calls GetHashCode(), whose return value decides the bucket in which the item is stored. If GetHashCode() returns the same value for two keys, Equals() is called to compare them, and its result tells the dictionary whether the keys are the same or different.


Solution 1:
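Implement the IEquatable<Person> interface for the Person class. This means that, “Hey, now the Person object will tell you whether it is the same as another Person object!”. You also need to override the Object.GetHashCode() method so that equal objects return the same hash code (an int); the values need not be unique, but the better they are spread out, the faster the lookups. It is also good practice to override Object.Equals(object) so that both notions of equality agree. Here, I use a simple method that just returns the length of the person’s name, which is valid but deliberately weak.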
    class Person : IEquatable<Person>
    {
        public int Age { get; set; }
        public string Name { get; set; }
 
        #region IEquatable<Person> Members
 
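        public bool Equals(Person other)
        {
            // A null argument is never equal to this instance.
            if (other == null)
            {
                return false;
            }

            return this.Age == other.Age
                && string.Equals(this.Name, other.Name);
        }
 
        #endregion
 
        public override int GetHashCode()
        {
            // Equal persons share a name, so they share this hash code.
            // A null name hashes to 0 instead of throwing.
            return this.Name == null ? 0 : this.Name.Length;
        }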
    }

Run the program after making these additions and the program works fine. The dictionary is now able to find appropriate values.

Solution 2:
There will be times when the key type is sealed or you are not allowed to change the key class. For these cases, Dictionary has a constructor that accepts an IEqualityComparer<TKey>; we can implement that interface to define equality and hashing in terms of the key object’s properties without having to modify the key class itself.

Here is the modified code:

    class Person
    {
        public int Age { get; set; }
        public string Name { get; set; }
    }
 
    class PersonEqualityComparer : IEqualityComparer<Person>
    {
        #region IEqualityComparer<Person> Members
 
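        public bool Equals(Person x, Person y)
        {
            // Same instance (or both null) counts as equal.
            if (ReferenceEquals(x, y)) return true;
            // Exactly one side is null.
            if (x == null || y == null) return false;

            return x.Age == y.Age
                && string.Equals(x.Name, y.Name);
        }
 
        public int GetHashCode(Person obj)
        {
            // A null name hashes to 0 instead of throwing.
            return obj.Name == null ? 0 : obj.Name.Length;
        }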
 
        #endregion
    }

Also remember to use the comparer object while creating the dictionary.

var personVsBonus = new Dictionary<Person, int>(new PersonEqualityComparer());

Run the program after making these changes and the program works just fine!

The most important thing to remember here is that your definition of GetHashCode() decides how fast your dictionary can look up an entry. Be very careful while taking these solutions into consideration: a weak hash (such as the name length used above, which puts every person with an equally long name into the same bucket) leads to long-term scalability issues, with lookups taking more and more time as the dictionary grows.
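As a rough illustration of a stronger hash, the sketch below combines both properties instead of using the name length. PersonKey and BonusLookup are hypothetical names introduced only for this example, and the 17/31 multipliers are one common convention rather than anything the framework requires.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical key type used only for this sketch.
public sealed class PersonKey : IEquatable<PersonKey>
{
    public int Age { get; set; }
    public string Name { get; set; }

    public bool Equals(PersonKey other)
    {
        return other != null
            && Age == other.Age
            && string.Equals(Name, other.Name);
    }

    // Keep Object.Equals consistent with IEquatable<PersonKey>.Equals.
    public override bool Equals(object obj)
    {
        return Equals(obj as PersonKey);
    }

    public override int GetHashCode()
    {
        // Overflow is expected and harmless when mixing hash values.
        unchecked
        {
            int hash = 17;
            hash = hash * 31 + Age;
            hash = hash * 31 + (Name == null ? 0 : Name.GetHashCode());
            return hash;
        }
    }
}

public static class BonusLookup
{
    // Stores a bonus under one key object and reads it back
    // through a second, equal key object.
    public static int FindBonus()
    {
        var bonuses = new Dictionary<PersonKey, int>();
        bonuses[new PersonKey { Age = 31, Name = "p2" }] = 2000;
        return bonuses[new PersonKey { Age = 31, Name = "p2" }];
    }
}
```

One caveat with any such key: the properties that feed GetHashCode() should not change while the object is stored in a dictionary, otherwise the entry can no longer be found.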

Sunday, January 29, 2012

TaskList in Visual Studio - Making it better

It was great to come across a new feature in Visual Studio - the TaskList. Well, yes, it isn't new, but it certainly was new to me. I had never really come across this wonderful feature until last week when secretGeek blogged that code would suck less if the compiler could raise an error when a // TODO token was detected. A comment from Goran indicated that it was completely possible with the current versions of Visual Studio:
“A //TODO: token. Are shown in VS if you click TaskList. Couldn’t live without this one :)”
Yeah, sure. Can’t live without it. Although the compiler doesn’t throw you an error, it indicates all parts of code where such tokens appear and hence can be taken care of easily.

How To Use
You can use the TaskList comments for:
  • Features to be added
  • Problems to be corrected
  • Classes to implement
  • Place-markers for error handling code
  • Reminders to check in the file
Add your comment in the code preceded by the token.    
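A minimal sketch of what such comments might look like is given here. InvoiceCalculator and its tax rate are made up for illustration; TODO is the token discussed in this post, and HACK is, to my knowledge, another token that Visual Studio recognizes out of the box.

```csharp
using System;

// Hypothetical class, used only to carry the task comments.
public static class InvoiceCalculator
{
    // TODO: Support currencies other than the default one.
    public static decimal AddTax(decimal amount)
    {
        // HACK: The tax rate is hard-coded until configuration is available.
        return amount * 1.10m;
    }
}
```

With the file open, both comments should appear in the TaskList under the 'Comments' category.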

Once done, open up the TaskList from the View menu.

Once the TaskList is up you will see something like this.

The TaskList also allows custom tokens to be included. Go to Tools –> Options –> Environment –> TaskList.

User Tasks
You can add your own tasks in the TaskList by selecting the ‘User Tasks’ selection from the dropdown list on the TaskList toolbox. These tasks aren’t associated with code, but will allow users to add their own high level tasks.

Extending the TaskList
I would like to see the following additional options in the TaskList:
  • Task Alarms – which would associate an alarm with a high priority comment to help just in case you forget to take care.
  • User Tasks to be more specific (by having a file anchor like ‘Comments’) to allow user tasks to be associated with a particular line in code.
And no, I’m not waiting on Microsoft to extend it in some other version of Visual Studio. I’m starting right away to write a package that would do this for me, hopefully.

More Things to Remember
With Visual Basic projects, the Task List displays all of the comments in the project. With Visual C# and Visual J# projects, the Task List displays only the comments that are found in the files currently opened for edit. With Visual C++ projects, the Task List displays only the comments that are found in the file currently active in the editor.

C# 3.0 for Beginners - Extension Methods - Part 2


Custom Extension Methods

C# 3.0 gives you the ability to extend the functionality of existing types. This means that you can write your own extension methods, which give the programmer the feeling that they are just methods provided by the existing type.

For example, System.Int32 doesn't provide an IsEven() method that tells you whether the integer holds an even value. It would obviously be great if you could write a method which does this, just in case you use this functionality heavily in your code. You would then write a method like this:
public bool IsEven(Int32 i)
{
    if(i%2 == 0)
    {
        return true;
    }
    else
    {
        return false;
    }
}

Well, that's perfectly fine. But I wish we could make it easier. And I bet we can; that's what extension methods are here for. No, we cannot go and add the IsEven() method to the Int32 type itself, but C# 3.0 allows you to extend type functionality in a different way.

Check out the following code. I'm extending the Int32 type to expose the Int32.IsEven() extension method.
public static class MyIntegerExtensionMethod
{
    public static bool IsEven(this Int32 i)
    {
        if (i % 2 == 0)
        {
            return true;
        }
        else
        {
            return false;
        }
    }
}

public class Program
{
    static void Main(string[] args)
    {
        Int32 n = 9;
        Console.WriteLine(n.IsEven());
    }
}

Let's look at the most important part, the method signature.
public static bool IsEven(this Int32 i)

A template for the above would be

<access_modifier> static <return_type> Method_Name(this <extended_type> <instance_of_type>, <args_list>)

While <access_modifier> and <return_type> are intuitive, the parameters of the method need special attention. The first parameter names the type which is to be extended, preceded with the 'this' keyword. The arguments that follow are the actual arguments/parameters to the method.
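To show the <args_list> part of the template in action, here is a small sketch of an extension method that takes one ordinary parameter after the 'this' parameter. IntegerExtensions and IsDivisibleBy are names invented for this example.

```csharp
using System;

public static class IntegerExtensions
{
    // 'value' is the extended instance; 'divisor' is a normal argument
    // that the caller passes inside the parentheses.
    // A divisor of 0 throws DivideByZeroException, as % normally does.
    public static bool IsDivisibleBy(this int value, int divisor)
    {
        return value % divisor == 0;
    }
}
```

The call site then reads `n.IsDivisibleBy(3)`: the instance before the dot fills the first parameter and only the remaining arguments are written out.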

Another thing to remember is that extension methods must be defined as static methods in a non-nested, non-generic static class. The only syntactic difference between a normal static method and an extension method is that the first parameter of an extension method is prefixed with the 'this' keyword. The rest are normal parameters. Also, the extension method is available only where the namespace of its containing class is in scope: either the calling code lives in the same namespace, or you import that namespace with a 'using' directive.

Usage Restrictions
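  • Extension methods can access only those members of the extended type that are accessible from where the extension method is defined (in practice its public members, plus internal ones within the same assembly); private members stay out of reach.
  • If you define an extension method whose signature matches that of a method already existing in the extended type, priority is given to the existing method and the new extension method is ignored.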
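The second restriction can be seen in a small sketch. Greeter and GreeterExtensions are hypothetical types written only for this illustration.

```csharp
using System;

public class Greeter
{
    // Instance method defined by the type itself.
    public string Greet()
    {
        return "instance";
    }
}

public static class GreeterExtensions
{
    // Same name and parameter list as Greeter.Greet(): method-call syntax
    // on a Greeter never binds to this one.
    public static string Greet(this Greeter greeter)
    {
        return "extension";
    }

    // Inside this method, greeter.Greet() still resolves to the
    // instance method.
    public static string GreetLoudly(this Greeter greeter)
    {
        return greeter.Greet().ToUpperInvariant();
    }
}
```

Calling `new Greeter().Greet()` yields "instance"; the extension version remains reachable only through the explicit static call `GreeterExtensions.Greet(greeter)`.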
Have fun with this great feature!

C# 3.0 for Beginners - Extension Methods - Part 1

All C# code is eventually executed by the .NET CLR, which knows nothing about query syntax. This requires the C# compiler to transform LINQ query expressions into a form that .NET understands: method calls. For in-memory sequences these methods are extension methods, which are slightly different from normal methods. Let's discuss them in more detail:

Consider the example that we saw in my previous posts.
    public static void GetIExplorer()
    {
        //  1. Data Source
        Process[] processes = Process.GetProcesses();

        //  2. Query Creation
        IEnumerable<int> query =
              from p in processes
              where p.ProcessName.ToLower().Equals("iexplore")
              select p.Id;

        //  3. Query execution
        foreach (int pid in query)
        {
            Console.WriteLine("Process Id : "+pid);
        }
    }

The query creation in step 2 is transformed into a series of method calls on the data source 'processes' as follows.
    IEnumerable<int> query =
            processes.Where(p => p.ProcessName.ToLower().Equals("iexplore"))
                     .Select(p => p.Id);

The Select() call, however, is not essential in this method form. You may leave it out, in which case you retrieve the matching processes themselves and not their process ids. Select() specifies what is to be picked from each matching element; in the above example, it was the id of the selected process.

Now, you might wonder that since the Where() method was called on the 'processes' Array, the Array object might have implemented the Where() method. But, that's not quite the case here. The method Where() is called an extension method and is used for extending a type's functionality.
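One way to convince yourself that Where() does not live on the array is to call it as the plain static method it really is. The sketch below uses an int array instead of the process list so that the result is predictable; WhereCallDemo is a name made up for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class WhereCallDemo
{
    // Extension-method syntax: reads as if the array had a Where() method.
    public static int[] ViaExtensionSyntax(int[] source)
    {
        return source.Where(n => n > 2).ToArray();
    }

    // The same call written against the static class that defines it,
    // System.Linq.Enumerable.
    public static int[] ViaStaticCall(int[] source)
    {
        return Enumerable.Where(source, n => n > 2).ToArray();
    }
}
```

Both methods return the same elements; the first form is only a more convenient way of writing the second.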

We have also made use of lambda expressions here. We will discuss their significance later in detail, but, for now, you can look at it as if it were a provision to select entities based on certain specified conditions. In the above code the variable 'p' stands for each element of the 'processes' array in turn; the expression p.ProcessName.ToLower().Equals("iexplore") is evaluated for it, and only if the result is true is the item included in the result of the query. Such expressions are filtering expressions.

In my next post I will describe how to define and use your own extension methods.

C# 3.0 for Beginners - Query Execution in LINQ

The execution of LINQ queries is different from what a new LINQ user would perceive it to be. Creating the query does not evaluate it, and the query variable never stores the results; it only stores the query commands. The query is evaluated when the values are actually required, that is, in the foreach iteration or in a manual iteration using the underlying GetEnumerator and MoveNext methods. This concept is referred to as deferred execution.
public static void GetIExplorer()
{
    //  1. Data Source
    Process[] processes = Process.GetProcesses();

    //  2. Query Creation
    IEnumerable<int> query =
       from p in processes
       where p.ProcessName.ToLower().Equals("iexplore")
       select p.Id;

    //  3. Query execution
    //  This is the part where the query is actually evaluated;
    //  results are produced one at a time as the loop asks for them.
    foreach (int pid in query)
    {
        Console.WriteLine("Process Id : "+pid);
    }
}

As the query variable doesn’t store the results, you can execute it as many times as you like. If a data source is updated at regular intervals, you could retrieve the latest results every time you iterate over the query variable.


The figure shown above shows information about the query variable ‘query’. The Results View section shows the results of query execution and it also informs the user that “Expanding the Results View will enumerate the IEnumerable” i.e. perform the query execution for debugging.

Forcing immediate execution of queries
Immediate execution of queries is forced for queries that perform aggregation functions over the data retrieved from query evaluation. These functions include Count, Max, Min, Average etc. These queries execute without an explicit foreach statement because the query itself would use a foreach statement to calculate the result.

The following query returns the number of processes identified as “IExplore.exe”.
//  Calculate the number of "iexplore" processes
int numberOfProcesses = query.Count();

Execution results can be cached (if need be) for temporary processing using the ToList() and ToArray() methods as shown below:
//  Caching results using ToList() and ToArray() methods.
List<int> queryResult =
    (from p in processes
     where p.ProcessName.ToLower().Equals("iexplore")
     select p.Id).ToList();

int[] inferredQueryResult =
    (from p in processes
     where p.ProcessName.ToLower().Equals("iexplore")
     select p.Id).ToArray();
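The difference between a deferred query and a cached result is easiest to see when the data source changes in between. The sketch below uses a small list of integers instead of processes so that the outcome is predictable; DeferredExecutionDemo is a name invented for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredExecutionDemo
{
    // Returns { size of the cached snapshot, number of items the query
    // yields when it is iterated after the source has changed }.
    public static int[] Run()
    {
        var numbers = new List<int> { 1, 2, 3 };

        // Query creation only; nothing is evaluated yet.
        IEnumerable<int> evens =
            from n in numbers
            where n % 2 == 0
            select n;

        // Immediate execution: the snapshot contains just 2.
        List<int> snapshot = evens.ToList();

        // The data source changes after the snapshot was taken.
        numbers.Add(4);

        // Deferred execution: the query runs now and sees 2 and 4.
        int liveCount = 0;
        foreach (int n in evens)
        {
            liveCount++;
        }

        return new[] { snapshot.Count, liveCount };
    }
}
```

Run() returns { 1, 2 }: the cached list still holds the single value found earlier, while iterating the query variable again picks up the newly added element.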


C# 3.0 for Beginners - Learning LINQ - An Overview

I’m probably quite late to provide an overview on LINQ, which is by now a lot more popular than the time when I actually had started hearing about it.

Well, this is expected to be a series of posts that would help other rookies like me who haven’t started learning LINQ yet. So, let’s get started. To begin with, I’m providing here an overview of LINQ (Language Integrated Query).

LINQ was introduced by Microsoft with the objective to reduce the complexity of accessing and integrating information. With the LINQ project, Microsoft has added query facilities to the .NET Framework that apply to all sources of information, not just relational or XML data. Every day programmers write code that accesses a data source using loops, conditional constructs and so on. The same logic can be written using query expressions that take far less code. LINQ makes it possible to write easily readable and elegant code. The examples that follow will show how easily understandable LINQ code can be.

LINQ defines a set of standard query operators that you can use for traversal, filter and projection operations. These standard operators can be applied to any IEnumerable<T> based information source. The set of standard query operators can be augmented with new domain-specific operators that are more suitable for the target domain or technology. This extensibility in the query architecture is used in the LINQ project itself to provide implementations that work over both XML (LINQ to XML) and SQL (LINQ to SQL) data. Let’s write some code to understand the query operators in more detail:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Diagnostics;

/// <summary>
/// Using standard query operators.
/// </summary>
public static void GetIExplorer()
{
    //  1. Data Source
    Process[] processes = Process.GetProcesses();

    //  2. Query Creation
    IEnumerable<int> query = from p in processes
                             where p.ProcessName.ToLower().Equals("iexplore")
                             select p.Id;

    //  3. Query execution
    foreach (int pid in query)
    {
        Console.WriteLine("Process Id : "+pid);
    }
}

All LINQ query operations consist of three distinct operations:

Identify the data source
Query creation
Query execution

Calling the method would show you the currently running Internet Explorer processes (their process ids). The heart of the method lies in the following statement of our program.
IEnumerable<int> query = from p in processes
                         where p.ProcessName.ToLower().Equals("iexplore")
                         select p.Id;

The expression on the right hand side of this statement is called the query expression. The query it describes is held in the local variable ‘query’. The query expression operates on one or more information sources by applying query operators from the standard or domain-specific set of query operators. We have used two standard query operators here, namely where and select.

The from clause introduces the list of processes, which becomes the input for the where operator; where filters the list and keeps only those elements that satisfy the specified condition. The selected elements are then processed by the select operator, which determines what information is picked from each element.

The above statement can also be written using explicit syntax as shown below:
IEnumerable<int> query = Process.GetProcesses()
               .Where(s => s.ProcessName.ToLower().Equals("iexplore"))
               .Select(s => s.Id);

This form of query is called a method-based query, and the arguments to the query operators are lambda expressions. In this form each query operator is an individual method, and the methods are chained using the dot notation. I will deal with lambda expressions in my following posts.

Anonymous methods in C#


An anonymous method is just a method that has no name. Anonymous methods are most frequently used to pass a code block as a delegate parameter.

Before C# 2.0, the only way to instantiate a delegate was to use a named method, as shown in the example below.

// Create the delegate
public delegate void Display(string message);

// Create the method
public static void DisplayOnConsole(string myMessage)
{
    Console.WriteLine(myMessage);
}

public static void Main(string[] args)
{
    // Here, the delegate is instantiated with the
    // method which should be invoked.
    Display myDisplay = new Display(DisplayOnConsole);

    // Invoke
    myDisplay("Hello World!");
}

Then C# 2.0 introduced the concept of anonymous methods.

// Creation of delegate with an anonymous method
Display newDisplay = delegate(string msg)
{
    // Anonymous method body
    Console.WriteLine(msg);
};

// Invoke
newDisplay("Hello World using Anonymous methods.");

The difference here is quite apparent. Creation of a new method to act as a handler is not required; the delegate itself specifies what should be done when invoked. Anonymous methods reduce the coding overhead in instantiating delegates because you do not have to create a separate method. Specifying a code block inline instead of a separate named method can be useful in situations where creating another method might seem an unnecessary overhead.

Some other points to remember:
  • There cannot be any jump statements like goto, break or continue inside the code block if the target is outside the block and vice versa.
  • The scope of parameters of an anonymous method is the anonymous code block.
  • No unsafe code can be used within the anonymous code block.
  • The local variables and parameters whose scope contains an anonymous method declaration are called outer variables of the anonymous method. 

In the following code the variable x is an outer variable.

int x = 0;
Display newDisplay = delegate(string msg)
{
    Console.WriteLine(msg + x);
};
  • An anonymous method cannot access the ref or out parameters of an outer scope.
C# 3.0 introduces lambda expressions which are pretty much similar to anonymous methods but more expressive and concise. Applications written using .NET Framework version 3.5 and higher should make use of lambda expressions instead of anonymous methods. I will discuss lambda expressions in detail after I start with LINQ (Language Integrated Query) in my following posts.
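To tie the outer-variable rule and the lambda remark together, here is a small sketch in which the same delegate is created once with an anonymous method and once with a lambda expression. The Format delegate and CaptureDemo class are names made up for this example; both delegates capture the variable x itself, not a copy of its value.

```csharp
using System;

// Hypothetical delegate type for this sketch.
public delegate string Format(string message);

public static class CaptureDemo
{
    public static string[] Run()
    {
        int x = 0;

        // C# 2.0 anonymous method; x is an outer variable.
        Format anonymous = delegate(string msg)
        {
            return msg + x;
        };

        // C# 3.0 lambda expression with the same meaning.
        Format lambda = msg => msg + x;

        // The change is visible to both delegates, because they share
        // the captured variable rather than a snapshot of it.
        x = 5;

        return new[] { anonymous("x = "), lambda("x = ") };
    }
}
```

Both entries of the returned array are "x = 5", which shows that the value is read when the delegate is invoked and not when it is created.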

Did you know the two uses of ‘default’?

Well, maybe you do and maybe you don’t. And I’m not trying to start a series of posts tagged “Did you know?”, I guess Sara Ford is much better at it.

You’ve probably used the default keyword in C# many times in switch statements to indicate the default operation to be carried out. That’s one place where you would normally use the default keyword.

You can also use the default keyword in generic code where it specifies the default value of the type parameter which is null for reference types and zero for value types. In generic programming it is always difficult to guess what default value to assign to a parameterized type T when you don’t know information such as whether T is a value/reference type; if it’s a value type, is it an elementary type or a struct? This is where the default keyword is used to find out the exact default value for the current type. Here is how it would be used:

class MyGenerics<T>
{
    public void MyMethod()
    {
        T myVar = default(T);
     
        //...
        //...
        int defaultValue = default(int);
    }
}

The default value for reference types is null; for simple value types it is zero (false for bool); and for struct types each member is initialized with its default value depending on whether it is a value or reference type.
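These rules can be checked with a tiny generic helper. LabeledPoint and DefaultDemo are hypothetical names used only in this sketch.

```csharp
using System;

// A struct with one value-type member and one reference-type member.
public struct LabeledPoint
{
    public int X;
    public string Label;
}

public static class DefaultDemo
{
    // Returns whatever default(T) is for the type argument supplied.
    public static T GetDefault<T>()
    {
        return default(T);
    }
}
```

GetDefault<int>() gives 0, GetDefault<bool>() gives false, GetDefault<string>() gives null, and GetDefault<LabeledPoint>() gives a struct whose X is 0 and whose Label is null.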

Conditional compilation in C#

Programmers need to debug, which sometimes means identifying points in the program where extra code would help to debug it efficiently. A simple example might be inserting a Console.WriteLine() call that prints out values or indicates completion (successful or unsuccessful) of the executed part. However, these lines clutter up the code, and the debugging code has to be removed again before the software is released.
This overhead is taken care of by specialized methods in C# that help the programmer debug the code without the need to clean up the debugging code for the release phase. These methods are called conditional methods. The compiler identifies calls to these marked methods and leaves them out of the release build, provided the associated compilation symbol is not defined there.
Well, that’s good enough. But, how do I make a method conditional? .NET provides an attribute System.Diagnostics.ConditionalAttribute (alias Conditional) to achieve this. Let’s look at some code now.
 
Defining the conditional method
using System;
using System.Diagnostics;

public class MyTracer   
{    
    [Conditional("DEBUG")]    
    public static void LogThisMessage(string myMessage, int severity)    
    {    
        //  Write message to screen or a file or ...    
        Console.WriteLine(" DEBUG MESSAGE : " + myMessage +" SEVERITY LEVEL : "+severity);    
    }    
}
 
The Conditional attribute has been applied to the LogThisMessage() method with the DEBUG conditional compilation symbol. This signals the compiler that calls to the method should be omitted if the conditional symbol DEBUG is not defined. Note that a conditional method must return void.
 
Using the conditional method
With the Solution Configurations set to Debug mode if you execute the following code, you would happily see the output window shown below.
 
public static void Main(string[] args)    
{    
    //  Some error occurred in my code    
    //  Log the message    
    MyTracer.LogThisMessage("The error 1221 has occurred.", 5);    
}

Now let’s see what happens to our conditional code in the Release mode. To switch between Debug and Release mode, look for the Solution Configurations selector in your Visual Studio IDE, which is shown below.

Change the mode to “Release” and run again: the call to the conditional method is no longer compiled in, so the debug message has disappeared from the output.
 
We used the predefined DEBUG compilation symbol in our code. You can have your own defined symbols and use them for conditional compilation. You can define custom compilation symbols in Project Properties -> Build tab -> General. Checkboxes have been provided to enable or disable the DEBUG and TRACE symbols.
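For comparison, the sketch below puts a conditional method next to the older #if/#else directives, and also shows that the arguments of an omitted call are not evaluated either. BuildInfo and its members are hypothetical names; what the code returns depends on whether DEBUG is defined for the build, so no fixed output is claimed.

```csharp
using System;
using System.Diagnostics;

public static class BuildInfo
{
    // Counts how often NextMessage() has actually been evaluated.
    public static int Evaluations;

    [Conditional("DEBUG")]
    public static void Log(string message)
    {
        Console.WriteLine(" DEBUG MESSAGE : " + message);
    }

    public static string NextMessage()
    {
        Evaluations++;
        return "evaluation #" + Evaluations;
    }

    public static string Describe()
    {
        // Without DEBUG the whole call disappears, including the
        // evaluation of its argument, so Evaluations stays unchanged.
        Log(NextMessage());

#if DEBUG
        return "DEBUG is defined";
#else
        return "DEBUG is not defined";
#endif
    }
}
```

The attribute keeps call sites free of directives, whereas #if has to be repeated wherever the debug-only code appears. A custom symbol defined in the project properties can be used in both places in the same way as DEBUG.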
 
.NET also provides two classes that provide similar functionality:
System.Diagnostics.Debug
System.Diagnostics.Trace
These classes contain methods that can also be used for debugging if you do not need more specific custom methods.

Covariance and Contravariance in delegates

If you’ve used delegates while programming you probably know about covariance and contravariance in delegates, which provide a degree of flexibility when you match method signatures with delegate types.
 
Covariance
Covariance allows a method to have a more derived return type than what is specified in the delegate. This is just another view of polymorphism where you would specify a more generalized delegate that could be used by the objects deriving from the specified type.
 
The following example shows where covariance comes into the picture.
 
class Fruit    
{    
}    
  
class Apple : Fruit    
{    
}    
  
class CovarianceExample    
{    
    public delegate Fruit CreateFruit();    
  
    public static Fruit CreateGenericFruit()    
    {    
        return new Fruit();    
    }    
  
    public static Apple CreateApple()    
    {    
        return new Apple();    
    }    
  
    public static void Main(string[] args)    
    {    
        CreateFruit fruitCreator = CreateGenericFruit;    
       
        // This is also perfectly fine.    
        CreateFruit appleCreator = CreateApple;    
    }    
}    
  
The delegate CreateFruit declares a return type of Fruit, yet it can also be bound to a method whose return type is Apple.
 
Contravariance
Contravariance allows a method to have parameters that are less derived than what is specified in the delegate type.
 
class ContravarianceExample    
{    
    public delegate void ProcessFruit(Apple myFruit);    
  
    public static void Process(Fruit myFruit)    
    {    
        Console.WriteLine("Fruit Processed.");    
    }    
  
    public static void Main(string[] args)    
    {    
        // Example of contravariance: Process accepts any Fruit,    
        // so it can be bound to a delegate that supplies an Apple.    
        ProcessFruit processor = Process;    
        
        processor(new Apple());    
    }    
}    
 
The above example shows how a delegate that takes an Apple can be bound to a method whose parameter is of the less derived type Fruit. This is safe because every Apple passed to the delegate is also a Fruit.
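Both kinds of variance can apply to a single delegate at the same time. In the sketch below, the Transform delegate and the Polish method are made up for illustration; Fruit and Apple mirror the classes used in the examples above.

```csharp
using System;

public class Fruit
{
}

public class Apple : Fruit
{
}

public static class VarianceDemo
{
    // The delegate takes an Apple and promises a Fruit.
    public delegate Fruit Transform(Apple input);

    // The method accepts a less derived parameter (contravariance)
    // and returns a more derived type (covariance).
    public static Apple Polish(Fruit fruit)
    {
        return new Apple();
    }

    public static Fruit Run()
    {
        // Allowed: the method's signature is compatible in both directions.
        Transform transform = Polish;
        return transform(new Apple());
    }
}
```

The object returned by Run() is statically typed as Fruit, but at run time it is the Apple created inside Polish.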
 
While it’s not always necessary to remember these terms, it is usually easier to remember these simple concepts when we remember the terms. I knew this was possible but I only came across these terms while browsing through MSDN documentation lately.
 
Although I’m pretty sure almost everyone who’s worked with delegates has come across these terms at least once, I would like to hear from you if this post has added two more words to your .NET vocab.

10 Things I do to code better

After being in a software development career for more than a year now, I feel I understand what is expected from a "Software Developer". There are a lot of things that need to be done, and they need to be done with care. I see every day that a lot of developers around me take software development as a job-to-be-done and not as a passion. Anyway, I'm here to talk about what I, as a software developer, do to write better code every day. Here's the list:
  
Keeping my code clean
Clean code could be described as code that is easily readable, and at the same time quite understandable. Stuff like proper indentation of code and the related comments can make your code look delicious.
 
Writing comments
Clean code and better comments make everyone's life simpler. Good comments are those which describe the 'WHW' (what, how and why) you do a particular thing in your code, in the simplest possible way. I call it "The expected WHW". Explaining this shouldn't be like writing an essay. Comments should be highly precise.
 
Writing Pre-conditions
Your code doesn't always stay with you. Everyone in your team should understand the pre-conditions and context when invoking a particular method that you might have written. The method which is being called should clearly describe its pre-condition. Pre-conditions can be something like "The Person object should be initialized by calling the Person.Init() method" or something like that. Here, if the object 'Person' wasn't initialized, it may result in errors when other developers use methods that use the 'Person' object. This step helps other developers to use your code effectively.
 
Loose coupling
I always use this design principle to stay on the safer side with regard to scalability and maintainability. Loose coupling enables you to define boundaries within your code structure, and makes it easier to test the code and to handle later changes. This is something the architect or designer usually has the say in, but as far as I am concerned, I do have the liberty to suggest better coupling structures; they are happily trashed, however, if we foresee any problem.
 
Knowing your programming language well
I'm always stuck in this quest of mastering the programming language I use for coding. I keep switching from C++ to C# and back, and I can't really choose between them, as they both have their own supernatural powers. I love C# for its simplicity, and it's always a pleasure to work with it in Visual Studio. I recently came across a guy (who had always worked in C++) who was doing some string processing in C# on .NET. He had written around 10 lines of code to split strings on a few characters (the C++ way, character by character), never knowing that a method called String.Split() was readily available. It is difficult to know everything in the huge .NET Framework, but the more you know, the better your life becomes.
 
Learning Tips and Tricks
I always try to learn new tips and tricks for using my development IDE (Visual Studio) in the best possible manner. These tips and tricks help you work faster. Consider using keyboard shortcuts rather than moving your hand to the mouse and then searching for the right button to click. This reduces the time spent interacting with the IDE and gives you more time to think about your code. I also learn tips and tricks in coding itself, i.e. how something could be done in the best possible way.
 
Using Source and Version control software effectively
Source and version control software sounds like an unnecessary nightmare to a low-on-experience software developer. I've had great times with Rational ClearCase. Its annoying ways help me learn more about it. Although a merge always screws up my code, I love the concept of source and version control because it lets me code without the fear of losing my previously written code by mistake. I'm still in the process of taming this wild animal called "ClearCase".
 
Following coding guidelines
I always try my best not to be cursed by my fellow developers for not following the coding guidelines. Adherence to these guidelines helps developers understand "The expected WHW" of your code. This in turn helps you keep the code clean.
 
Reviewing my code
I always perform a self-review of my code before sending it to other team members for review. Reviews help you identify defects in the code at an early stage. A defect might escape your attention but get caught by someone who looks at the code in a different manner, or by someone who has been working on the same thing for a long time and can easily point out the possible defects.
 
Knowing the context
Last, but never the least, it is a must to understand the entire context in which your code is used, especially when you are adding functionality to existing code. There's always a chance that your changes could affect and break others' code. Always consult the developers who wrote the earlier code and get your code reviewed by them to avoid unexpected and vexing results.
 
This is a compilation of some things that I think help me in learning to code better. Opinions may vary, and if you’re still reading, I would like to hear your opinion on this. What do you think are other things that developers need to take care of?

Dynamic bitsets not supported in C++ STL

It was annoying and frustrating to discover that an essential feature was missing from the C++ standard library (STL) that I was using. I was writing code that would create a bit array whose length could range from 1 to a thousand, or say ten thousand. This needed to be dynamically allocated. So, if I said:
bitset* b;
...    
b = new bitset(number_of_bits); // somewhere in the program
This would create a bitset of "number_of_bits" bits. That would have been cool, but this isn't the way bitsets are supposed to be used. The usage of bitsets is quite rigid: bitset is a template that needs the size of the bit array at compile time, something like this:
bitset<1000> b;   
or
bitset<1000> *b = new bitset<1000>(); 
Now, how does that help me? What if I need more than 1000 bits someday? I can't just give it the highest possible constant value. Everything would fail one day when the question of scalability arises. I don't understand why the folks who developed the STL didn't have this in mind. They did it for all the other data structures but couldn't do the same for bitsets? Why? The question might sound strange, but the answer might be even stranger.

So, what's the solution?
Use the Boost libraries, which have their own implementation called dynamic_bitset. Damn! I can't use Boost, as the firm that I work for doesn't want to.
Use vector<bool> and overload the bitwise operators to act on it. Well, that's as good as creating my own implementation of a dynamic bitset.
 
So, I've decided that I'll create my own dynamic version of <bitset> as the guys at MSDN forums told me to.
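
A rough sketch of what such a class could look like, built on the vector<bool> idea from the list above. The class name DynamicBitset and its member functions are my own placeholders (this is neither Boost's dynamic_bitset nor anything from the standard library), and it only covers AND and OR on operands of equal size.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical class for illustration; the size is a runtime value,
// unlike std::bitset<N> where N must be known at compile time.
class DynamicBitset {
public:
    explicit DynamicBitset(std::size_t numberOfBits)
        : bits_(numberOfBits, false) {}

    std::size_t size() const { return bits_.size(); }

    void set(std::size_t pos, bool value = true) {
        assert(pos < bits_.size());
        bits_[pos] = value;
    }

    bool test(std::size_t pos) const {
        assert(pos < bits_.size());
        return bits_[pos];
    }

    // Number of bits that are currently set.
    std::size_t count() const {
        std::size_t n = 0;
        for (std::size_t i = 0; i < bits_.size(); ++i) {
            if (bits_[i]) ++n;
        }
        return n;
    }

    // Grow or shrink at runtime; newly added bits start cleared.
    void resize(std::size_t numberOfBits) { bits_.resize(numberOfBits, false); }

    // Bitwise operators; this sketch requires operands of equal size.
    DynamicBitset& operator&=(const DynamicBitset& other) {
        assert(size() == other.size());
        for (std::size_t i = 0; i < bits_.size(); ++i) {
            bits_[i] = bits_[i] && other.bits_[i];
        }
        return *this;
    }

    DynamicBitset& operator|=(const DynamicBitset& other) {
        assert(size() == other.size());
        for (std::size_t i = 0; i < bits_.size(); ++i) {
            bits_[i] = bits_[i] || other.bits_[i];
        }
        return *this;
    }

private:
    std::vector<bool> bits_;
};

inline DynamicBitset operator&(DynamicBitset lhs, const DynamicBitset& rhs) {
    lhs &= rhs;
    return lhs;
}

inline DynamicBitset operator|(DynamicBitset lhs, const DynamicBitset& rhs) {
    lhs |= rhs;
    return lhs;
}
```

With this, DynamicBitset b(number_of_bits); works with a value read at runtime, and b.resize() handles the day I need more than 1000 bits. The operators go bit by bit, so it is nowhere near as fast as a word-at-a-time implementation, but it keeps the sketch short.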

Phishing in the name of Midwest Airlines

What happens when you receive a very polite email from an airline company which tells you that you have booked a ticket somewhere across the globe and your credit card has been charged with $690? This doesn't sound strange if you've really bought the ticket on your credit card. What happens when you know that you haven't?

This happened to my colleague recently. She received a mail from the phisher pretending to be the Midwest Airlines web service which thanked her for purchasing the ticket and informed her that her credit card account was charged with $690. Gosh! You should have seen the look on her face. I definitely can't describe it. It was a mixture of fear (the fear of losing $690, which is quite a large amount), confusion (the confusion of what should be done next) and curiosity (all said and done, she too is a techie, knows and is curious about this stuff). But it's kind of cool to study the behavior of people becoming victims (or in this case, potential victims) of phishing.

She gave me a shout across the desk and asked what she should do next. I told her not to delete the mail (as I needed it as a real phishing example for posting on my blog, cruel thinking!) and to inform the information security folks about the problem. And I shouldn't have believed her on that. She deleted the mail, and my dreams of including snapshots of that mail and its attachment were destroyed. Anyway, you can find the pattern of the mail and the attachment in this article on CyberInsecure.com.

The best part of it was when I asked her to forward the mail to me. She looked at me as if I was planning to learn phishing by using that Trojan as my tool. But, by the time I asked for it, the mail was long gone (the mail was a victim of the Shift+Del disaster).

The attachment contains an exe file named E-ticket_[number].doc.exe which is a Trojan horse that steals information, including keystrokes, from the infected Windows PC and transmits that data to a server hosted in Russia. Now, that is something to take note of. Almost a year ago, this Trojan ripped off more than 1.6 million customer records from Monster Worldwide Inc., the company that operates the popular Monster.com recruiting Web site.

Have you ever been phished?

ClearCase and My Uncontrolled Source

I had a really funny time working with the Rational ClearCase source control software yesterday.

I'm not a regular ClearCase guy. In fact, I hate source control software. It's always a pain until you realize its power. I've been working on ClearCase for like five months or something, but I still don't feel comfortable with it, especially when it's installed with Visual Studio 2005.

Yesterday, I tried renaming a file from "abc.cpp" to "xyz.cpp". Some crap happened in there and BOOM!!, the file was gone. It was nowhere to be seen, neither in the ClearCase Explorer nor in Windows Explorer. My mouth was left open and my lungs deflated at the very thought of writing 1000 lines of code again. Where did the file go?? I don't know!

The only thing I could think of then was to search for it. But how? Not manually through each directory, of course. Pop! I opened up Windows Explorer Search (which was unbelievably slow, considering the fact that my files were stored on a "high-speed" processing server connected by a "high-speed" network). Was it an attack by some freaky terrorist trying to destroy my valuable work? Windows search disagreed with that thought. The results showed that there was a file named "xyz.cpp.04ac136e421d4108b617d79bf2aec045" in a directory called "lost+found". Now, what does that mean? Was my file lost?? Probably it was, which is in turn very, very strange, and no one likes such surprises.

Thanks anyway to ClearCase for preserving a copy of the file before losing it. And folks, remember to take care when using Visual Studio to rename files that are managed by ClearCase. Have you had any such crazy experience?

Efficient XML processing

Nowadays, many developers deal with a lot of XML files every day. These files are used for anything from configuration and documentation to databases, where they serve for data sharing, data transport or simplifying platform changes. They can grow very large and need to be processed in an optimized way.
For example, while reading a configuration file, the module that reads the XML iteratively reads the tag of the current XML element and decides what processing it then has to do. A relatively large XML file would contain a lot of different XML elements (distinguished by their tags), and the tag needs to be checked each time you encounter an element.
A brute force way to achieve this would be to check for each XML tag by doing a string comparison in an if-else ladder. Say, for now, your XML file contains just three tags: Config1, Config2 and Config3. Your code would look something like this:
  
    class Caller 
    { 
        public void Call(string inputValue) 
        { 
            // Using the if-else ladder 
            if(inputValue.Equals("Config1")) 
            { 
                Method1(); 
            } 
            else if(inputValue.Equals("Config2")) 
            { 
                Method2(); 
            } 
            else if(inputValue.Equals("Config3")) 
            { 
                Method3(); 
            } 
        } 
    } 
 
All works well. But what if the number of XML tags that need to be handled grows each day? You will be handling "Config1" to "ConfigN" in the same way as before, using the if-else ladder. And what if you have no control over the value of 'N'? That is when the processing time for each file increases and a need arises to check the efficiency of the code. In the worst case every element costs N string comparisons, and having so many of them can ruin your code in terms of efficiency, maintainability and scalability.

If you try to visualize the above code in terms of a map, you would see that:

"Config1" maps to Method1()
"Config2" maps to Method2()
and so on...

Here's when you know that it would be useful to modify your code to use a Hashtable-style lookup. Initialize the dictionary to store <key, value> pairs as <string, MethodHandler>, where the string represents the XML tag such as "Config1", "Config2" and so on, and MethodHandler is a delegate type that references the method that needs to be called. This can be done using an initializer method such as:

        private delegate void MethodHandler(); 
        private SortedDictionary<string, MethodHandler> stringToDelegateDict; 
  
        public Caller() 
        { 
            stringToDelegateDict = new SortedDictionary<string, MethodHandler>(); 
        } 
  
        public void Initialize() 
        { 
            MethodHandler handler1 = new MethodHandler(Method1); 
            MethodHandler handler2 = new MethodHandler(Method2); 
            MethodHandler handler3 = new MethodHandler(Method3); 
            AddHandler ("Config1", handler1); 
            AddHandler ("Config2", handler2); 
            AddHandler ("Config3", handler3); 
            // All the handler methods are initialized here in the SortedDictionary. 
        } 

Note that we use a generic SortedDictionary here rather than a Hashtable, since we know the types of the keys and values that will be inserted. This saves us from the downcasting we would need with a Hashtable. (The generic Dictionary<string, MethodHandler> offers the same type safety with hash-based lookups, if sorted keys are not needed.)

Using this approach, you can also decide at runtime which handlers need to be present in the dictionary, so that handlers are subscribed only when they are needed. The following methods add a handler to the dictionary or remove one from it.

       // Adding a new handler 
       public void AddHandler(string inputValue, MethodHandler handler) 
       { 
           stringToDelegateDict.Add(inputValue, handler); 
       } 

       // Removing an existing handler 
       public void RemoveHandler(string inputValue) 
       { 
           stringToDelegateDict.Remove(inputValue); 
       } 
 
When you encounter an XML tag now, you just make the same call that you did earlier, like this:

       Caller c = new Caller(); 
       c.Initialize(); 
       c.Call("Config1"); 
       c.Call("Config2"); 
  
However, you change your Call method to have the following implementation.
 
       public void Call(string inputValue) 
       { 
           // Look up the delegate for this tag. Unknown tags are ignored, 
           // just as they fell through the if-else ladder. 
           MethodHandler mh; 
           if (stringToDelegateDict.TryGetValue(inputValue, out mh)) 
           { 
               // Make the call. 
               mh(); 
           } 
       } 

This definitely makes life simpler when dealing with changes in the XML tags: adding handlers for new tags, removing handlers for existing tags, and doing both at runtime. Now, any change to the XML format needs changes only in the Caller.Initialize method. Here, the method handlers act as subscribers that can dynamically subscribe to or unsubscribe from a particular event.

Now, what if your design says that it needs to call both Method1 and Method2 to handle the tag “Config1”? If you were still using the if-else ladder (call that design OrdinaryCaller), you would handle it something like this:

       // The OrdinaryCaller.Call method needs to be changed like this. 
       public void CallModified(string inputValue) 
       { 
           // Using the if-else ladder 
           if (inputValue.Equals("Config1")) 
           { 
               Handler.Method1(); 
               Handler.Method2(); 
           } 
           else if (inputValue.Equals("Config2")) 
           { 
               Handler.Method2(); 
           } 
           else if (inputValue.Equals("Config3")) 
           { 
               Handler.Method3(); 
           } 
       } 

The problem with doing this is that you need to keep changing the code of the Call() method. The main disadvantage, though, is that you cannot change the behaviour of the Call() method at runtime.

Here is where the concept of multicast delegates comes into the picture. You slightly modify the AddHandler() method to support multicast delegates. This lets you have any number of delegate methods as handlers for each tag.

       // Adding a new handler 
       public void AddHandler(string inputValue, MethodHandler handler) 
       { 
           // Check if the string key is already available... 
           if (stringToDelegateDict.ContainsKey(inputValue)) 
           { 
               // If yes, combine the existing delegate with the new one 
               // into a multicast delegate and store it under the same key. 
               MethodHandler temp = stringToDelegateDict[inputValue]; 
               temp += handler; 
               stringToDelegateDict[inputValue] = temp; 
           } 
           else 
           { 
               // else, simply add the handler. 
               stringToDelegateDict.Add(inputValue, handler); 
           } 
       } 

Just call the AddHandler() method to subscribe a new handler and voila! You’ve got another handler working right where you wanted it.

I feel that the technique described above is just a better way of handling large XML files. I’m sure cleaner options exist that I’m not aware of. What do you think? What would be a better and more efficient technique to do this?